Python 2.1 alpha 2 released

Fri, 2 Feb 2001 18:40:08 -0500 (EST)

While Guido has been working the press circuit at the LinuxWorld Expo
in New York City, the Python developers, including the many volunteers
and the folks from PythonLabs, have been busy finishing the second
alpha release of Python 2.1.

The release is currently available from SourceForge and will also be
available from python.org later today.  You can find the source
release at:

    http://sourceforge.net/project/showfiles.php?group_id=5470

The Windows installer will be ready shortly.

Fred Drake announced the documentation release earlier today.  You can
browse the new docs online at

    http://python.sourceforge.net/devel-docs/

or download them from

    ftp://ftp.python.org/pub/python/doc/2.1a2/

Please give it a good try!  The only way Python 2.1 can become a
rock-solid product is if people test the alpha releases.  If you are
using Python for demanding applications or on extreme platforms, we
are particularly interested in hearing your feedback.  Are you
embedding Python or using threads?  Please test your application using
Python 2.1a2!  Please submit all bug reports through SourceForge:

  http://sourceforge.net/bugs/?group_id=5470

Here's the NEWS file:

What's New in Python 2.1 alpha 2?
=================================

Core language, builtins, and interpreter

- Scopes nest.  If a name is used in a function or class, but is not
  local, the definition in the nearest enclosing function scope will
  be used.  One consequence of this change is that lambda expressions
  can reference variables in the namespaces where the lambda is
  defined.  In some unusual cases, this change will break code.

  In all previous versions of Python, names were resolved in exactly
  three namespaces -- the local namespace, the global namespace, and
  the builtin namespace.  According to this old definition, if a
  function A is defined within a function B, the names bound in B are
  not visible in A.  The new rules make names bound in B visible in A,
  unless A contains a name binding that hides the binding in B.

  Section 4.1 of the reference manual describes the new scoping rules
  in detail.  The test script in Lib/test/test_scope.py demonstrates
  some of the effects of the change.

  The new rules will cause existing code to break if it defines nested
  functions where an outer function has local variables with the same
  name as globals or builtins used by the inner function.  Example:

    def munge(str):
        def helper(x):
            return str(x)
        if type(str) != type(''):
            str = helper(str)
        return str.strip()

  Under the old rules, the name str in helper() is bound to the
  builtin function str().  Under the new rules, it will be bound to
  the argument named str and an error will occur when helper() is
  called.
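
  The sketch below illustrates the new lookup rule and one way to
  repair the munge() example.  It is written to run on a current
  Python interpreter rather than taken from the release; make_adder
  and the renamed argument `value` are hypothetical names chosen for
  illustration.

```python
def make_adder(n):
    # n is local to make_adder.  With nested scopes the lambda can see it
    # directly, without the old "lambda x, n=n: ..." default-argument trick.
    return lambda x: x + n


def munge(value):
    # Renaming the argument keeps the builtin str() visible inside helper().
    def helper(x):
        return str(x)
    if not isinstance(value, str):
        value = helper(value)
    return value.strip()


add_three = make_adder(3)
result_sum = add_three(4)
result_text = munge(42)
```

  Renaming the outer variable is usually the smallest fix for code
  broken by the new rules.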

- The compiler will report a SyntaxError if "from ... import *" occurs
  in a function or class scope.  The language reference has documented
  that this case is illegal, but the compiler never checked for it.
  The recent introduction of nested scopes makes the meaning of this
  form of name binding ambiguous.  In a future release, the compiler
  may allow this form when there is no possibility of ambiguity.

- repr(string) is easier to read, now using hex escapes instead of octal,
  and using \t, \n and \r instead of \011, \012 and \015 (respectively):

  >>> "\texample \r\n" + chr(0) + chr(255)
  '\texample \r\n\x00\xff'         # in 2.1
  '\011example \015\012\000\377'   # in 2.0

- Functions are now compared and hashed by identity, not by value, since
  the func_code attribute is writable.
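
  A small illustration of identity-based comparison, assuming a
  current interpreter; the function names are made up for the example.

```python
def first():
    return 1


def second():
    return 1


# Same body, but two distinct function objects.
same_object = (first == first)
distinct = (first == second)

# Both can serve as separate dictionary keys because hashing is by identity.
callbacks = {first: "a", second: "b"}
```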

- Weak references (PEP 205) have been added.  This involves a few
  changes in the core, an extension module (_weakref), and a Python
  module (weakref).  The weakref module is the public interface.  It
  includes support for "explicit" weak references, proxy objects, and
  mappings with weakly held values.
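
  The following sketch shows the three facilities side by side.  It
  uses the names found in the current weakref module (ref, proxy,
  WeakValueDictionary); the alpha may spell the mapping class
  differently, so treat those names as an assumption.  Node is a
  hypothetical class.

```python
import gc
import weakref


class Node(object):
    """Hypothetical class; instances can be weakly referenced."""

    def __init__(self, name):
        self.name = name


node = Node("a")
ref = weakref.ref(node)                 # explicit weak reference: call it to get the object
proxy = weakref.proxy(node)             # proxy: used like the object itself
cache = weakref.WeakValueDictionary()   # mapping with weakly held values
cache["a"] = node

alive = (ref() is node, proxy.name, "a" in cache)

del node        # drop the only strong reference
gc.collect()    # not required with reference counting, but harmless

dead = (ref() is None, "a" in cache)
```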

- A 'continue' statement can now appear in a try block within the body
  of a loop.  It is still not possible to use continue in a finally
  clause.
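
  For instance, a loop can now skip an item from inside the try block
  itself.  parse_ints is a hypothetical helper written for this note.

```python
def parse_ints(tokens):
    values = []
    for token in tokens:
        try:
            if not token.strip():
                continue              # 'continue' directly inside the try block
            values.append(int(token))
        except ValueError:
            pass                      # ignore tokens that are not integers
    return values


parsed = parse_ints(["1", " ", "x", "42"])
```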

Standard library

- mailbox.py now has a new class, PortableUnixMailbox, which is
  identical to UnixMailbox but uses a more portable scheme for
  determining From_ separators.  Also, the constructors for all the
  classes in this module have a new optional `factory' argument, which
  is a callable used when new message classes must be instantiated by
  the next() method.

- random.py is now self-contained, and offers all the functionality of
  the now-deprecated whrandom.py.  See the docs for details.  random.py
  also supports new functions getstate() and setstate(), for saving
  and restoring the internal state of the generator; and jumpahead(n),
  for quickly forcing the internal state to be the same as if n calls to
  random() had been made.  The latter is particularly useful for multi-
  threaded programs, creating one instance of the random.Random() class for
  each thread, then using .jumpahead() to force each instance to use a
  non-overlapping segment of the full period.
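
  A minimal sketch of saving and restoring generator state, assuming
  a private random.Random instance and an arbitrary seed value.
  jumpahead() is left out because current interpreters no longer
  provide it.

```python
import random

gen = random.Random(2001)        # private generator, seeded for repeatability
state = gen.getstate()           # snapshot the internal state
first_run = [gen.random() for _ in range(3)]

gen.setstate(state)              # rewind to the snapshot
second_run = [gen.random() for _ in range(3)]
```

  After setstate(), the generator replays exactly the same sequence.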

- random.py's seed() function is new.  For bit-for-bit compatibility with
  prior releases, use the whseed function instead.  The new seed function
  addresses two problems:  (1) The old function couldn't produce more than
  about 2**24 distinct internal states; the new one about 2**45 (the best
  that can be done in the Wichmann-Hill generator).  (2) The old function
  sometimes produced identical internal states when passed distinct
  integers, and there was no simple way to predict when that would happen;
  the new one guarantees to produce distinct internal states for all
  arguments in [0, 27814431486576L).

- The socket module now supports raw packets on Linux.  The socket
  family is AF_PACKET.

- test_capi.py is a start at running tests of the Python C API.  The tests
  are implemented by the new Modules/_testmodule.c.

- A new extension module, _symtable, provides provisional access to the
  internal symbol table used by the Python compiler.  A higher-level
  interface will be added on top of _symtable in a future release.

Windows changes

- Build procedure:  the zlib project is built in a different way that
  ensures the zlib header files used can no longer get out of synch with
  the zlib binary used.  See PCbuild\readme.txt for details.  Your old
  zlib-related directories can be deleted; you'll need to download fresh
  source for zlib and unpack it into a new directory.

- Build:  New subproject _test for the benefit of test_capi.py (see above).

- Build:  subproject ucnhash is gone, since the code was folded into the
  unicodedata subproject.

What's New in Python 2.1 alpha 1?
=================================

Core language, builtins, and interpreter

- There is a new Unicode companion to the PyObject_Str() API
  called PyObject_Unicode(). It behaves in the same way as the
  former, but assures that the returned value is a Unicode object
  (applying the usual coercion if necessary).

- The comparison operators support "rich comparison overloading" (PEP
  207).  C extension types can provide a rich comparison function in
  the new tp_richcompare slot in the type object.  The cmp() function
  and the C function PyObject_Compare() first try the new rich
  comparison operators before trying the old 3-way comparison.  There
  is also a new C API PyObject_RichCompare() (which also falls back on
  the old 3-way comparison, but does not constrain the outcome of the
  rich comparison to a Boolean result).

  The rich comparison function takes two objects (at least one of
  which is guaranteed to have the type that provided the function) and
  an integer indicating the opcode, which can be Py_LT, Py_LE, Py_EQ,
  Py_NE, Py_GT, Py_GE (for <, <=, ==, !=, >, >=), and returns a Python
  object, which may be NotImplemented (in which case the tp_compare
  slot function is used as a fallback, if defined).

  Classes can overload individual comparison operators by defining one
  or more of the methods __lt__, __le__, __eq__, __ne__, __gt__,
  __ge__.  There are no explicit "reflected argument" versions of
  these; instead, __lt__ and __gt__ are each other's reflection,
  likewise for __le__ and __ge__; __eq__ and __ne__ are their own
  reflection (similar at the C level).  No other implications are
  made; in particular, Python does not assume that == is the Boolean
  inverse of !=, or that < is the Boolean inverse of >=.  This makes
  it possible to define types with partial orderings.

  Classes or types that want to implement (in)equality tests but not
  the ordering operators (i.e. unordered types) should implement ==
  and !=, and raise an error for the ordering operators.

  It is possible to define types whose rich comparison results are not
  Boolean; e.g. a matrix type might want to return a matrix of bits
  for A < B, giving elementwise comparisons.  Such types should ensure
  that any interpretation of their value in a Boolean context raises
  an exception, e.g. by defining __nonzero__ (or the tp_nonzero slot
  at the C level) to always raise an exception.
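
  As a sketch of a partial ordering, the hypothetical Subset class
  below orders sets by inclusion and defines only __eq__, __ne__,
  __lt__ and __le__, leaving > and >= to the reflected methods.  It
  is written for a current interpreter (frozenset is not part of
  this release).

```python
class Subset(object):
    """Hypothetical example type: sets ordered by inclusion."""

    def __init__(self, items):
        self.items = frozenset(items)

    def __eq__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items == other.items

    def __ne__(self, other):
        # Defined explicitly: == and != are not assumed to be inverses.
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items != other.items

    def __lt__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items < other.items

    def __le__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items <= other.items

    # No __gt__ or __ge__: "a > b" falls back to the reflected "b < a".


a = Subset([1])
b = Subset([1, 2])
c = Subset([3])

proper_subset = a < b          # True
reflected = b > a              # True, answered by a.__lt__(b)
unrelated = (a < c, a >= c)    # (False, False): neither contains the other
```

  The last line shows why < and >= are not treated as inverses: for
  unrelated sets both are false.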

- Complex numbers use rich comparisons to define == and != but raise
  an exception for <, <=, > and >=.  Unfortunately, this also means
  that cmp() of two complex numbers raises an exception when the two
  numbers differ.  Since it is not mathematically meaningful to compare
  complex numbers except for equality, I hope that this doesn't break
  too much code.
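
  A short illustration, assuming a current interpreter, where the
  ordering failure surfaces as a TypeError:

```python
a = 1 + 2j
b = 1 - 2j

equal = (a == b)            # equality tests are still defined
not_equal = (a != b)

try:
    a < b                   # ordering comparisons are rejected
    ordering_rejected = False
except TypeError:
    ordering_rejected = True
```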

- Functions and methods now support getting and setting arbitrarily
  named attributes (PEP 232).  Functions have a new __dict__
  (a.k.a. func_dict), which holds the function attributes.  Methods get
  and set attributes on their underlying im_func.  It is a TypeError
  to set an attribute on a bound method.
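
  The sketch below shows function attributes in use.  handler and
  Controller are hypothetical names.  The except clause also accepts
  AttributeError because current interpreters raise that instead of
  the TypeError described above.

```python
def handler(request):
    return "handled %s" % request


handler.route = "/index"                  # stored in handler.__dict__
handler.__dict__["methods"] = ("GET",)    # the same dictionary, accessed directly


class Controller(object):
    def action(self):
        return "ok"
    action.cached = True                  # set on the plain function in the class body


bound = Controller().action
read_through = bound.cached               # reads go through to the underlying function

try:
    bound.cached = False                  # writes through a bound method are rejected
    set_failed = False
except (TypeError, AttributeError):
    set_failed = True
```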

- The xrange() object implementation has been improved so that
  xrange(sys.maxint) can be used on 64-bit platforms.  There's still a
  limitation that in this case len(xrange(sys.maxint)) can't be
  calculated, but the common idiom "for i in xrange(sys.maxint)" will
  work fine as long as the index i doesn't actually reach 2**31.
  (Python uses regular ints for sequence and string indices; fixing
  that is much more work.)

- Two changes to from...import:

  1) "from M import X" now works even if M is not a real module; it's
     basically a getattr() operation with AttributeError exceptions
     changed into ImportError.

  2) "from M import *" now looks for M.__all__ to decide which names to
     import; if M.__all__ doesn't exist, it uses M.__dict__.keys() but
     filters out names starting with '_' as before.  Whether or not
     __all__ exists, there's no restriction on the type of M.
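
  To see the effect of __all__, the sketch below builds a throwaway
  module in memory and imports from it twice.  The module name
  demo_all_mod and its attributes are hypothetical, and the code
  assumes a current interpreter (types.ModuleType, exec() as a
  function).

```python
import sys
import types

demo = types.ModuleType("demo_all_mod")   # hypothetical in-memory module
demo.public = 1
demo.also_public = 2
demo._private = 3
demo.__all__ = ["public", "_private"]     # explicit export list
sys.modules["demo_all_mod"] = demo

with_all = {}
exec("from demo_all_mod import *", with_all)

del demo.__all__                          # fall back to the leading-underscore rule
without_all = {}
exec("from demo_all_mod import *", without_all)

interesting = ("public", "also_public", "_private")
names_with_all = sorted(n for n in with_all if n in interesting)
names_without_all = sorted(n for n in without_all if n in interesting)

del sys.modules["demo_all_mod"]           # clean up the throwaway module
```

  With __all__ present, exactly the listed names are imported, even
  one with a leading underscore; without it, only the names that do
  not start with '_' are.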

- File objects have a new method, xreadlines().  This is the fastest
  way to iterate over all lines in a file:

  for line in file.xreadlines():
      ...do something to line...

  See the xreadlines module (mentioned below) for how to do this for
  other file-like objects.

- Even if you don't use file.xreadlines(), you may expect a speedup on
  line-by-line input.  The file.readline() method has been optimized
  quite a bit in platform-specific ways:  on systems (like Linux) that
  support flockfile(), getc_unlocked(), and funlockfile(), those are
  used by default.  On systems (like Windows) without getc_unlocked(),
  a complicated (but still thread-safe) method using fgets() is used by
  default.

  You can force use of the fgets() method by #define'ing
  USE_FGETS_IN_GETLINE at build time (it may be faster than
  getc_unlocked()).

  You can force fgets() not to be used by #define'ing
  DONT_USE_FGETS_IN_GETLINE (this is the first thing to try if std test
  test_bufio.py fails -- and let us know if it does!).

- In addition, the fileinput module, while still slower than the other
  methods on most platforms, has been sped up too, by using
  file.readlines(sizehint).

- Support for run-time warnings has been added, including a new
  command line option (-W) to specify the disposition of warnings.
  See the description of the warnings module below.

- Extensive changes have been made to the coercion code.  This mostly
  affects extension modules (which can now implement mixed-type
  numerical operators without having to use coercion), but
  occasionally, in boundary cases the coercion semantics have changed
  subtly.  Since this was a terrible gray area of the language, this
  is considered an improvement.  Also note that __rcmp__ is no longer
  supported -- instead of calling __rcmp__, __cmp__ is called with
  reflected arguments.

- In connection with the coercion changes, a new built-in singleton
  object, NotImplemented is defined.  This can be returned for
  operations that wish to indicate they are not implemented for a
  particular combination of arguments.  From C, this is
  Py_NotImplemented.
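
  As an illustration of how NotImplemented lets two types cooperate
  on a mixed-type operation, consider the hypothetical Meters and
  Feet classes below (the conversion factor is 0.3048 metres per
  foot).  The sketch assumes a current interpreter.

```python
class Meters(object):
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        return NotImplemented       # let Python try other.__radd__(self)


class Feet(object):
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        if isinstance(other, Meters):
            return Meters(other.value + self.value * 0.3048)
        return NotImplemented


total = Meters(1) + Feet(10)        # Meters.__add__ declines, Feet.__radd__ answers

try:
    Meters(1) + "x"                 # nobody implements this combination
    unsupported = False
except TypeError:
    unsupported = True
```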

- The interpreter now accepts bytecode files on the command line even
  if they do not have a .pyc or .pyo extension.  On Linux, after executing

  echo ':pyc:M::\x87\xc6\x0d\x0a::/usr/local/bin/python:' > /proc/sys/fs/binfmt_misc/register

  any byte code file can be used as an executable (i.e. as an argument
  to execve(2)).

- %[xXo] formats of negative Python longs now produce a sign
  character.  In 1.6 and earlier, they never produced a sign,
  and raised an error if the value of the long was too large
  to fit in a Python int.  In 2.0, they produced a sign if and
  only if too large to fit in an int.  This was inconsistent
  across platforms (because the size of an int varies across
  platforms), and inconsistent with hex() and oct().  Example:

  >>> "%x" % -0x42L
  '-42'      # in 2.1
  'ffffffbe' # in 2.0 and before, on 32-bit machines
  >>> hex(-0x42L)
  '-0x42L'   # in all versions of Python

  The behavior of %d formats for negative Python longs remains
  the same as in 2.0 (although in 1.6 and before, they raised
  an error if the long didn't fit in a Python int).

  %u formats don't make sense for Python longs, but are allowed
  and treated the same as %d in 2.1.  In 2.0, a negative long
  formatted via %u produced a sign if and only if too large to
  fit in an int.  In 1.6 and earlier, a negative long formatted
  via %u raised an error if it was too big to fit in an int.
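
  The 2.1 behavior can be checked with a few lines like the
  following.  The L suffix is dropped here so the sketch runs on a
  current interpreter, where all integers behave like longs.

```python
value = -0x42

formatted_hex = "%x" % value       # sign character, no two's-complement wraparound
formatted_oct = "%o" % value
formatted_dec = "%d" % value
builtin_hex = hex(value)

# A value far too large for a machine int is formatted the same way.
big_hex = "%x" % -(2 ** 70)
```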

- Dictionary objects have an odd new method, popitem().  This removes
  an arbitrary item from the dictionary and returns it (in the form of
  a (key, value) pair).  This can be useful for algorithms that use a
  dictionary as a bag of "to do" items and repeatedly need to pick one
  item.  Such algorithms normally end up running in quadratic time;
  using popitem() they can usually be made to run in linear time.
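
  A minimal work-list sketch using popitem(); the task names are
  invented for the example.

```python
def drain(todo):
    """Process every item in the 'to do' dictionary, emptying it."""
    done = []
    while todo:
        key, value = todo.popitem()   # removes and returns an arbitrary pair
        done.append((key, value))
    return done


todo = {"parse": 1, "compile": 2, "link": 3}
processed = drain(todo)
```

  Each popitem() call does a constant amount of work, whereas picking
  a key with todo.keys()[0] and then deleting it rebuilds the key list
  on every pass.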

Standard library

- In the time module, the time argument to the functions strftime,
  localtime, gmtime, asctime and ctime is now optional, defaulting to
  the current time (in the local timezone).
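
  For example, the following calls are now valid without an explicit
  time argument (a sketch, assuming a current interpreter):

```python
import time

year_default = time.strftime("%Y")                     # uses the current local time
year_explicit = time.strftime("%Y", time.localtime())  # localtime() also defaults to now
text_ctime = time.ctime()
text_asctime = time.asctime()
utc_tuple = time.gmtime()
```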

- The ftplib module now defaults to passive mode, which is deemed a
  more useful default given that clients are often inside firewalls
  these days.  Note that this could break if ftplib is used to connect
  to a *server* that is inside a firewall, from outside; this is
  expected to be a very rare situation.  To fix that, you can call
  ftp.set_pasv(0).

- The site module now uses .pth files not only for path configuration
  but also for extending the initialization code:  lines starting
  with import are executed.

- There's a new module, warnings, which implements a mechanism for
  issuing and filtering warnings.  There are some new built-in
  exceptions that serve as warning categories, and a new command line
  option, -W, to control warnings (e.g. -Wi ignores all warnings, -We
  turns warnings into errors).  warnings.warn(message[, category])
  issues a warning message; this can also be called from C as
  PyErr_Warn(category, message).
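
  The sketch below issues a warning and then changes its disposition
  from within the program, roughly what -Wi and -We do from the
  command line.  old_api is a hypothetical function; filterwarnings()
  and resetwarnings() are used as found in the current warnings
  module.

```python
import warnings


def old_api():
    warnings.warn("old_api() is deprecated", DeprecationWarning)
    return "result"


# Ignore this category: the call proceeds silently.
warnings.filterwarnings("ignore", category=DeprecationWarning)
quiet = old_api()

# Turn this category into an error: the warning is raised as an exception.
warnings.filterwarnings("error", category=DeprecationWarning)
try:
    old_api()
    raised = False
except DeprecationWarning:
    raised = True

warnings.resetwarnings()    # drop the filters installed above
```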

- A new module xreadlines was added.  This exports a single factory
  function, xreadlines().  The intention is that this code is the
  absolutely fastest way to iterate over all lines in an open
  file(-like) object:

  import xreadlines
  for line in xreadlines.xreadlines(file):
      ...do something to line...

  This is equivalent to the previous speed record holder, which used
  file.readlines(sizehint).  Note that if file is a real file object
  (as opposed to a file-like object), this is equivalent:

  for line in file.xreadlines():
      ...do something to line...

- The bisect module has new functions bisect_left, insort_left,
  bisect_right and insort_right.  The old names bisect and insort
  are now aliases for bisect_right and insort_right.  The XXX_right
  and XXX_left functions differ in what happens when the new element
  compares equal to one or more elements already in the list:  the
  XXX_left functions insert to the left, the XXX_right functions to
  the right.  Code that doesn't care where equal elements end up should
  continue to use the old, short names ("bisect" and "insort").
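
  The difference is easiest to see with a list that already contains
  the value being searched for.  In the sketch below, the float 20.0
  compares equal to the int 20, which makes the insertion point
  visible.

```python
import bisect

scores = [10, 20, 20, 30]
left = bisect.bisect_left(scores, 20)      # index before the existing 20s
right = bisect.bisect_right(scores, 20)    # index after the existing 20s
default = bisect.bisect(scores, 20)        # alias for bisect_right

leftmost = [10, 20, 30]
bisect.insort_left(leftmost, 20.0)         # 20.0 lands before the equal 20

rightmost = [10, 20, 30]
bisect.insort_right(rightmost, 20.0)       # 20.0 lands after the equal 20
```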

- The new curses.panel module wraps the panel library that forms part
  of SYSV curses and ncurses.  Contributed by Thomas Gellekum.

- The SocketServer module now sets the allow_reuse_address flag by
  default in the TCPServer class.

- A new function, sys._getframe(), returns the stack frame pointer of
  the caller.  This is intended only as a building block for
  higher-level mechanisms such as string interpolation.
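
  A toy version of such a building block is sketched below.
  caller_name and interpolate are hypothetical helpers; the frame
  attributes f_code and f_locals are the standard ones.

```python
import sys


def caller_name():
    # Frame 0 is caller_name itself; frame 1 is whoever called it.
    return sys._getframe(1).f_code.co_name


def interpolate(template):
    # Fill %(name)s fields from the caller's local variables.
    caller_locals = dict(sys._getframe(1).f_locals)
    return template % caller_locals


def greet():
    user = "Guido"
    return interpolate("hello %(user)s"), caller_name()


greeting, who = greet()
```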

Build issues

- For Unix (and Unix-compatible) builds, configuration and building of
  extension modules is now greatly automated.  Rather than having to
  edit the Modules/Setup file to indicate which modules should be
  built and where their include files and libraries are, a
  distutils-based setup.py script now takes care of building most
  extension modules.  All extension modules built this way are built
  as shared libraries.  Only a few modules that must be linked
  statically are still listed in the Setup file; you won't need to
  edit their configuration.

- Python should now build out of the box on Cygwin.  If it doesn't,
  send mail to Jason Tishler (jlt63 at users.sourceforge.net).

- Python now always uses its own (renamed) implementation of getopt()
  -- there's too much variation among C library getopt()
  implementations.

- C++ compilers are better supported; the CXX macro is always set to a
  C++ compiler if one is found.

Windows changes

- select module:  By default under Windows, a select() call
  can specify no more than 64 sockets.  Python now boosts
  this Microsoft default to 512.  If you need even more than
  that, see the MS docs (you'll need to #define FD_SETSIZE
  and recompile Python from source).

- Support for Windows 3.1, DOS and OS/2 is gone.  The Lib/dos-8x3
  subdirectory is no more!

-- Jeremy Hylton <http://www.python.org/~jeremy/>