Future division patch available (PEP 238)

Christian Tanzer tanzer at swing.co.at
Wed Jul 25 04:37:04 EDT 2001


Guido,

I deeply respect your language design skills and I appreciate your
ongoing improvements. I'm impatiently looking forward to using many of
the features new in 2.2.

The future division patch is a different issue, though. While I agree
that the proposed changes are a definite improvement over the current
semantics, the issue of backwards compatibility is a huge problem that
is difficult to solve. I see the following problems:

- What's going to happen to code released into the wild (i.e., I can
  change all my code but what about code I gave to others)? 

- How can one write readable code working correctly in both old and
  new Python versions?

- It takes a potentially huge effort to change all the existing code.

- In some cases, warnings might not be seen (e.g., they might land in
  /dev/null or in some log files nobody looks at).

- If an application jumps versions (e.g., from 2.1 to 2.6), no warnings
  might be generated at all.

- Upgrading to a new version of an application might break user scripts or
  databases.

I have no idea how to tackle all these issues but I'll offer some
ideas nevertheless.

- If `int` always truncated (instead of truncating or rounding,
  depending on how the C compiler does it), one could write reasonably
  readable version-independent code for truncating integer division.

  Compare `int (a/b)` to `divmod (a, b) [0]` or
  `int (math.floor (a/b))`.

- Just a wild idea: the problem you want to solve is that the existing
  division operator mixes two totally different meanings and thus
  leads to nasty surprises.

  What if `/` applied to two integer values returned neither an
  integer nor a float but an object carrying the float result but
  behaving like an integer if used in an integer context?

  For instance:

      >>> x = 1/2
      >>> type(x)
      <type 'Ratio'>
      >>> print "%d %f %s" % (x, x, x)
      0 0.5 0.5
      >>> 2 * x
      0
      >>> 2. * x
      1.0

  The difficult issue here is how `integer context` is defined.
  Should multiplication by an integer be considered an integer
  context? Pro: would preserve correctness of existing code like
  `(size / 8) * 8`. Con: is incompatible with Rationals which might
  be added in the future.

- Command line options are not a good way of handling this -- in many
  cases, different modules might need different settings. Even worse,
  looking at the code of a module won't tell you what option to use.

- If there is a possibility of specifying division semantics on a
  per-module basis (via a directive or the file extension), it should
  also be possible to specify the semantics for thingies like
  `compile`, `execfile`, `exec`, and `eval`.

  This only works if absence of a semantics indicator means old-style
  division. 

  I think this would go the farthest to alleviate compatibility
  problems. I understand your desire to avoid dragging the past around
  with you wherever you go and I like Python for its cleanliness. But
  in this case, it might be worthwhile to carry the ballast.
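As a rough illustration of the first idea above, the three spellings
of truncating division can be checked side by side. This is only a
sketch: the helper name `spellings` is made up for the example, and
the code assumes that `/` already means true division (the new-style
semantics).

```python
import math


def spellings(a, b):
    """Return the results of the three spellings for a divided by b.

    Assumes `/` is true division (new-style semantics).
    """
    return (
        int(a / b),              # int() drops the fractional part (toward zero)
        divmod(a, b)[0],         # floor of the quotient
        int(math.floor(a / b)),  # floor of the quotient, made explicit
    )


print(spellings(7, 2))   # (3, 3, 3)
print(spellings(-7, 2))  # (-3, -4, -4)
```

The spellings agree for non-negative operands. For negative operands,
`int (a/b)` on a true-division result rounds toward zero while the
other two round toward minus infinity, so the intended rounding has
to be settled before picking a spelling.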

Let me outline the problems faced by my current customer TTTech (I'm
working as a consultant for them). [This is going to be long -- sorry.]

TTTech provides design and on-line tools for embedded distributed
real-time systems in safety critical application domains (e.g.,
aerospace, automotive, ...). TTTech sells software tools (programmed
in Python) to customers worldwide.

Currently, there is a major release once a year. For various
reasons, the shipped tools normally don't use the most recent version
of Python. The current release is still based on 1.5.2. We hope to use
Python 2.1 for the release planned for the end of the year.
Internally, we try to use the most recent Python version. Therefore,
our Python code must be compatible with several Python versions.

The division change affects:

- Python programs
- Python scripts
- user scripts
- design databases

Python programs
---------------

I just used Skip's div-finder (thanks, Skip) to check the code of
three of our applications. It finds 391 uses of division. I know that
many of those are meant to be truncating divisions, while many others
are meant to be floating divisions. Somebody will have to look at each
and every one and fix it -- automatic conversion won't be possible.

Unfortunately, the applications also contain lots of code inside of
strings fed to eval or exec during run-time. I don't know
how many divisions are in those, but somebody will have to look at
them as well. This is one area frequently overlooked when the effect
of changes is discussed and conversion tools proposed on c.l.py.
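To make the point about code in strings concrete, here is a minimal
sketch of a token-based division counter. It is not Skip's div-finder;
the function name `count_divisions` and the sample strings are
hypothetical, and the code assumes a Python version that provides
`tokenize.generate_tokens` and `io.StringIO`.

```python
import io
import tokenize


def count_divisions(source):
    """Count `/` and `/=` operator tokens in a string of Python source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return sum(
        1
        for tok in tokens
        if tok.type == tokenize.OP and tok.string in ("/", "/=")
    )


# The division lives inside a string literal, so a scan of the
# enclosing module sees only a STRING token and reports nothing.
outer = 'exec("ratio = used / total")\n'
inner = "ratio = used / total\n"

print(count_divisions(outer))  # 0
print(count_divisions(inner))  # 1
```

A scanner of this kind would have to be run on the contents of every
string that ends up in eval or exec, and those strings are often
assembled only at run-time.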

As these tools are frozen, they don't depend on what Python version
the user has installed.

Python scripts
--------------

Internally, TTTech uses quite a number of Python scripts.
Unfortunately, different users have different Python versions
installed. Currently, 1.5.2, 2.0, and 2.1 are installed (there might
still be the odd 1.5.1 around somewhere, too). As the scripts are
taken from a central file server whereas Python is installed locally,
the scripts and the library modules used must be compatible with all
the Python versions deployed. That makes migration difficult if the
same symbol means grossly different things in different versions.

User scripts
------------

Our tools are user scriptable. These scripts are written in Python and
executed in the application's context via execfile (they don't work as
standalone scripts).

Such scripts are written and maintained by unknown customers who may
or may not be programmers and who may or may not have Python
experience. (One of the nice features of Python is that even a
computer-naive user can start writing scripts with little knowledge
about Python by modifying examples.) Quite often, important scripts
have been implemented by people who have since changed jobs.

Various customers use such scripts for interfacing to other tools,
creating designs, checking designs for conformance, writing test
cases, implementing test frameworks, generating design reports, ...

The delivery of a new tool version ***must not break*** such scripts.
TTTech simply cannot tell their customers that they have to review all
their scripts and change some but not all occurrences of the division
operator. We don't want to get stuck with an outdated Python version,
either. 

OTOH, we cannot assume old-style semantics in the scripts either, as
new users might never have heard about how division used to work in
warty versions of Python.

Design databases
----------------

Design databases store the design of an embedded distributed
real-time system as specified by the user. Such databases must stay
alive for a looooong time (think of 10+ years for some application
domains). 

Our tools allow the specification of symbolic expressions by the user.
Such expressions are fed through eval at the right time (i.e., late)
to get a numeric value. The symbolic expressions are stored in the
database as entered by the user. Reading an old database with a new
tool version ***must not change*** the semantics.
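One conceivable way for a tool to keep the old meaning of stored
expressions under an interpreter with new-style division is to
rewrite `/` before evaluation. The following is only a sketch under
stated assumptions, not what any TTTech tool does: the names
`classic_div` and `eval_classic` are hypothetical, and the code relies
on the `ast` module of later Python versions.

```python
import ast


def classic_div(a, b):
    """Old-style `/`: floor for two integers, true division otherwise."""
    if isinstance(a, int) and isinstance(b, int):
        return a // b
    return a / b


class _ClassicDivision(ast.NodeTransformer):
    """Replace every `left / right` with `classic_div(left, right)`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested divisions first
        if isinstance(node.op, ast.Div):
            call = ast.Call(
                func=ast.Name(id="classic_div", ctx=ast.Load()),
                args=[node.left, node.right],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node


def eval_classic(expression, names=None):
    """Evaluate a stored expression with old-style division semantics."""
    tree = ast.parse(expression, mode="eval")
    tree = ast.fix_missing_locations(_ClassicDivision().visit(tree))
    scope = {"classic_div": classic_div}
    scope.update(names or {})
    return eval(compile(tree, "<stored expression>", "eval"), scope)


print(eval("(size / 8) * 8", {"size": 20}))          # 20.0
print(eval_classic("(size / 8) * 8", {"size": 20}))  # 16
```

The stored text stays untouched; only the evaluation step decides
which semantics apply, so a database would need some marker saying
which of the two it was written for.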

To be honest, for TTTech design databases the change in division
probably doesn't pose any problems. Due to user demand, the tools
have coerced divisions to floating point for a long time. Other companies
might be bitten in this way, though.

-- 
Christian Tanzer                                         tanzer at swing.co.at
Glasauergasse 32                                       Tel: +43 1 876 62 36
A-1130 Vienna, Austria                                 Fax: +43 1 877 66 92




