[Python-Dev] Splitting the PEP for adding a decimal type to Python

Michael McLay mclay@nist.gov
Fri, 27 Jul 2001 15:51:38 -0400

On Friday 27 July 2001 12:35 pm, Guido van Rossum wrote:
> [me]
> > I wasn't suggesting creating a separate interpreter, I was
> > suggesting adding a simple mechanism for allowing a new dialect of
> > Python to be added to the existing interpreter.
> Understood.  I see no big difference in having two binaries or one
> binary with a command line option; the two binaries effectively
> contain the same functionality, just with a different default.  I
> would vote for one binary; if you really think it's too much for your
> users to say "python -d" instead of "dpython", give them a script.  (I
> know that the -d option currently means something else.  That's a
> detail to worry about later.)

I decided to use a symbolic link to a different command name to set the 
default encoding of numerical literals. I did this because referring to the 
'dpython' command is more concise than "python -d".  The executable could 
also have command options to select between python and dpython modes.
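
As a rough sketch of the symbolic-link idea (written in Python for brevity;
it is not the actual patch, and the function name decimal_is_default is
made up for illustration), the startup code only needs to look at the name
the executable was invoked under:

```python
import os

def decimal_is_default(argv0):
    """Return True when the interpreter was started under the name 'dpython'.

    Hypothetical helper: a symbolic link named 'dpython' pointing at the
    'python' binary makes argv[0] end in 'dpython', and that alone selects
    the decimal default for numeric literals.
    """
    command = os.path.basename(argv0)
    return command == "dpython"

print(decimal_is_default("/usr/local/bin/dpython"))
print(decimal_is_default("/usr/local/bin/python"))
```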

> I'm not very fond of having multiple dialects.  There are lots of
> contexts where the dialect in use is not explicitly mentioned
> (e.g. when people discuss fragments of Python code).

I'm not fond of dialects when they don't serve a significant purpose.  
However, I believe it would be useful to at least discuss creating a 
special-purpose "safe" mode for the Python lexer.  This mode would be attractive to 
newbies and financial programmers.  Calling this a new dialect is an 
overstatement.  It is more like defining a subset of the language that uses a 
special vocabulary for working with decimal types.

> > Another would be to use Unicode as the default character set.  This
> > would allow Unicode characters to be in strings without needing to
> > escape them.
> That's not a dialect, that's a different input encoding.  MAL already
> has a PEP for that.

I know about the PEP.  I was referring to making Unicode the default string 
type for a '.dp' file.  No 'u' prefix would be required.  

I'll remove this and the other unrelated items from the decimal type PEP.

If you don't agree with the idea of adding a dpython lexer mode, there is 
no point in discussing the features that would be in that mode.

> > The idea of adding a new language on top of the existing
> > infrastructure isn't that unusual. The gcc compiler can process many
> > languages to produce a common machine dependant object code.  I can
> > envision taking my simple changes a few steps further and turning
> > the entire tokenizer into a replaceable unit.  This approach would
> > allows projects to build other languages on top of the Python byte
> > code interpreter.  Imagine having Javascript, VBasic, or sh
> > tokenizer frontends generating Python bytecodes.  Think of it as the
> > pyNET architecture:-) This change probably belongs in Python4k.
> Or in Python .NET.  Decoupling the various part of the parse+compile
> pipeline is something I've considered.

Did you decide against it, or has it just not been a high enough priority?

> But again this has nothing to do with decimal numbers: your proposal
> allows the mixing of decimal and binary numbers (as long as one of
> them uses an explicit base indicator) so you don't really need two
> parsers -- you need one tokenizer plus a way to specify the default
> numeric base for literals.

That is exactly what I implemented.  The dpython command and the '.dp' file 
extension cause the Py_USE_DECIMAL_AS_DEFAULT[1] flag to be set.  When this 
flag is set, numeric literals default to decimal numbers.  
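
To illustrate the "one tokenizer plus a default" behaviour, here is a
minimal sketch in Python.  It is not the real tokenizer change: it uses
today's standard decimal module as a stand-in for the proposed type, and
the 'd' and 'b' suffixes are hypothetical placeholders for whatever
explicit indicator the PEP ends up using.

```python
from decimal import Decimal

def convert_literal(text, decimal_default):
    """Convert the text of a numeric literal to a Decimal or a float.

    decimal_default plays the role of the Py_USE_DECIMAL_AS_DEFAULT flag.
    The 'd' (decimal) and 'b' (binary) suffixes are assumptions made for
    this sketch.  Only plain base-10 literals are handled; hex literals
    such as 0x1b are out of scope.
    """
    if text[-1] in "dD":
        return Decimal(text[:-1])   # explicitly decimal
    if text[-1] in "bB":
        return float(text[:-1])     # explicitly binary floating point
    # No suffix: the flag decides which type the literal becomes.
    return Decimal(text) if decimal_default else float(text)

print(repr(convert_literal("1.10", True)))
print(repr(convert_literal("1.10", False)))
print(repr(convert_literal("1.10b", True)))
```

With the flag set, an unsuffixed literal becomes a decimal, while an
explicit suffix still allows both kinds of number to be mixed in one file.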

> I'll have to go back to your defense of the two dialect approach, but
> I think it's neither sufficient nor necessary.

I have mixed too many ideas into one PEP.  I'll rework the PEP to remove the 
cruft and focus on the addition of decimal numbers.  I'll move the other 
ideas into a separate PEP.

> Well, sometimes more generality than you need hurts.  I'm not
> convinced that we need an open-ended set of numeric literals.  But in
> the light of the unified numeric model, we may need ways to make
> exactness or inexactness explicit, and/or we may need a way to specify
> rational numbers.  If we can fit all of these in the
> number-with-letter-suffix mold, that would be nice for the lexer, I
> suppose.

I worry about a "unified numerical model" getting overly complex.  I think 
decimal numbers help because they are a better choice than binary numbers for 
a significant percentage of all software applications.  I know that rational 
numbers are important in some applications.  Am I overlooking some huge class 
of applications that use rationals?  While Tim and some of the other 
Pythoneers can probably think of dozens of specialized numerical types, I 
would venture to guess that binary types and a decimal type probably cover 
90% of all users' requirements. 
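
A small example of why decimal numbers suit financial work better than
binary floating point.  This is only a sketch, and it uses the decimal
module from today's standard library as a stand-in for the type under
discussion:

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly, so the sum
# is not exactly 0.3.
binary_total = 0.1 + 0.2
binary_matches = (binary_total == 0.3)

# Decimal arithmetic keeps the values exactly as they were written.
decimal_total = Decimal("0.1") + Decimal("0.2")
decimal_matches = (decimal_total == Decimal("0.3"))

print(binary_matches)   # False
print(decimal_matches)  # True
```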

[1] I'll be renaming the flag to this in the next version.  The flag is 
currently called Py_NEW_PARSER.  I named it that because at one time I was 
creating a new parser.  I trimmed the changes down to just a few edits of the 
tokenizer and compile.c.