[Python-checkins] r45736 - peps/trunk/pep-0000.txt peps/trunk/pep-3101.txt peps/trunk/pep-3102.txt
david.goodger
python-checkins at python.org
Wed Apr 26 22:33:26 CEST 2006
Author: david.goodger
Date: Wed Apr 26 22:33:25 2006
New Revision: 45736
Added:
peps/trunk/pep-3101.txt (contents, props changed)
peps/trunk/pep-3102.txt (contents, props changed)
Modified:
peps/trunk/pep-0000.txt
Log:
added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments
Modified: peps/trunk/pep-0000.txt
==============================================================================
--- peps/trunk/pep-0000.txt (original)
+++ peps/trunk/pep-0000.txt Wed Apr 26 22:33:25 2006
@@ -104,6 +104,8 @@
S 358 The "bytes" Object Schemenauer
S 359 The "make" Statement Bethard
S 754 IEEE 754 Floating Point Special Values Warnes
+ S 3101 Advanced String Formatting Talin
+ S 3102 Keyword-Only Arguments Talin
Finished PEPs (done, implemented in Subversion)
@@ -425,7 +427,8 @@
P 3002 Procedure for Backwards-Incompatible Changes Bethard
I 3099 Things that will Not Change in Python 3000 Brandl
I 3100 Python 3.0 Plans Kuchling, Cannon
-
+ S 3101 Advanced String Formatting Talin
+ S 3102 Keyword-Only Arguments Talin
Key
@@ -522,6 +525,7 @@
Smith, Kevin D. Kevin.Smith at theMorgue.org
Stein, Greg gstein at lyra.org
Suzi, Roman rnd at onego.ru
+ Talin talin at acm.org
Taschuk, Steven staschuk at telusplanet.net
Tirosh, Oren oren at hishome.net
Warnes, Gregory R. warnes at users.sourceforge.net
Added: peps/trunk/pep-3101.txt
==============================================================================
--- (empty file)
+++ peps/trunk/pep-3101.txt Wed Apr 26 22:33:25 2006
@@ -0,0 +1,346 @@
+PEP: 3101
+Title: Advanced String Formatting
+Version: $Revision$
+Last-Modified: $Date$
+Author: Talin <talin at acm.org>
+Status: Draft
+Type: Standards
+Content-Type: text/plain
+Created: 16-Apr-2006
+Python-Version: 3.0
+Post-History:
+
+
+Abstract
+
+ This PEP proposes a new system for built-in string formatting
+ operations, intended as a replacement for the existing '%' string
+ formatting operator.
+
+
+Rationale
+
+ Python currently provides two methods of string interpolation:
+
+ - The '%' operator for strings.
+
+ - The string.Template module.
+
+ The scope of this PEP will be restricted to proposals for built-in
+ string formatting operations (in other words, methods of the
+ built-in string type). This does not obviate the need for more
+ sophisticated string-manipulation modules in the standard library
+ such as string.Template. In any case, string.Template will not be
+ discussed here, except to say that this proposal will most
+ likely have some overlapping functionality with that module.
+
+ The '%' operator is primarily limited by the fact that it is a
+ binary operator, and therefore can take at most two arguments.
+ One of those arguments is already dedicated to the format string,
+ leaving all other variables to be squeezed into the remaining
+ argument. The current practice is to use either a dictionary or a
+ tuple as the second argument, but as many people have commented
+ [1], this lacks flexibility. The "all or nothing" approach
+ (meaning that one must choose between only positional arguments,
+ or only named arguments) is felt to be overly constraining.
+
+
+Specification
+
+ The specification will consist of 4 parts:
+
+ - Specification of a set of methods to be added to the built-in
+ string class.
+
+ - Specification of a new syntax for format strings.
+
+ - Specification of a new set of class methods to control the
+ formatting and conversion of objects.
+
+ - Specification of an API for user-defined formatting classes.
+
+
+String Methods
+
+ The built-in string class will gain two new methods. The first
+ method is 'format', and takes an arbitrary number of positional
+ and keyword arguments:
+
+ "The story of {0}, {1}, and {c}".format(a, b, c=d)
+
+ Within a format string, each positional argument is identified
+ with a number, starting from zero, so in the above example, 'a' is
+ argument 0 and 'b' is argument 1. Each keyword argument is
+ identified by its keyword name, so in the above example, 'c' is
+ used to refer to the third argument.
+
+ The result of the format call is an object of the same type
+ (string or unicode) as the format string.
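
A minimal sketch of the call shape described above. The values bound to a, b and d are invented for the illustration; the snippet runs on the str.format method in current Python, which accepts positional and keyword arguments in the same way.

```python
# Hypothetical values; only the call shape comes from the text above.
a, b, d = "Arthur", "Lancelot", "Galahad"

# {0} and {1} pick positional arguments, {c} picks the keyword argument.
story = "The story of {0}, {1}, and {c}".format(a, b, c=d)
print(story)
```

Note that the keyword name used inside the braces is 'c', not the name of the variable ('d') whose value is passed.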
+
+
+Format Strings
+
+ Brace characters ('curly braces') are used to indicate a
+ replacement field within the string:
+
+ "My name is {0}".format('Fred')
+
+ The result of this is the string:
+
+ "My name is Fred"
+
+ Braces can be escaped using a backslash:
+
+ "My name is {0} :-\{\}".format('Fred')
+
+ Which would produce:
+
+ "My name is Fred :-{}"
+
+ The element within the braces is called a 'field'. Fields consist
+ of a name, which can either be simple or compound, and an optional
+ 'conversion specifier'.
+
+ Simple names are either names or numbers. If numbers, they must
+ be valid decimal numbers; if names, they must be valid Python
+ identifiers. A number is used to identify a positional argument,
+ while a name is used to identify a keyword argument.
+
+ Compound names are a sequence of simple names separated by
+ periods:
+
+ "My name is {0.name} :-\{\}".format(dict(name='Fred'))
+
+ Compound names can be used to access specific dictionary entries,
+ array elements, or object attributes. In the above example, the
+ '{0.name}' field refers to the dictionary entry 'name' within
+ positional argument 0.
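
The lookup described above could be sketched roughly as follows. The helper name resolve_compound_name is hypothetical, and the order in which dictionary entries, array elements and attributes are tried is an assumption; the text does not specify one.

```python
def resolve_compound_name(name, args, kwargs):
    """Resolve a field name such as '0.name' against the call arguments.

    Hypothetical helper: the lookup order used here is an assumption.
    """
    first, *rest = name.split(".")
    # A number selects a positional argument, a name a keyword argument.
    value = args[int(first)] if first.isdecimal() else kwargs[first]
    for part in rest:
        if isinstance(value, dict) and part in value:
            value = value[part]            # dictionary entry
        elif part.isdecimal():
            value = value[int(part)]       # array element
        else:
            value = getattr(value, part)   # object attribute
    return value


fred = resolve_compound_name("0.name", (dict(name="Fred"),), {})
second = resolve_compound_name("words.1", (), {"words": ["spam", "eggs"]})
real_part = resolve_compound_name("0.real", (3 + 4j,), {})
```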
+
+ Each field can also specify an optional set of 'conversion
+ specifiers'. Conversion specifiers follow the field name, with a
+ colon (':') character separating the two:
+
+ "My name is {0:8}".format('Fred')
+
+ The meaning and syntax of the conversion specifiers depends on the
+ type of object that is being formatted; however, many of the
+ built-in types will recognize a standard set of conversion
+ specifiers.
+
+ The conversion specifier consists of a sequence of zero or more
+ characters, each of which can consist of any printable character
+ except for a non-escaped '}'. The format() method does not
+ attempt to interpret the conversion specifiers in any way; it
+ merely passes all of the characters between the first colon ':'
+ and the matching right brace ('}') to the various underlying
+ formatters (described later.)
+
+ When using the 'fformat' variant, it is possible to omit the field
+ name entirely, and simply include the conversion specifiers:
+
+ "My name is {:pad(23)}"
+
+ This syntax is used to send special instructions to the custom
+ formatter object (such as instructing it to insert padding
+ characters up to a given column.) The interpretation of this
+ 'empty' field is entirely up to the custom formatter; no
+ standard interpretation will be defined in this PEP.
+
+ If a custom formatter is not being used, then it is an error to
+ omit the field name.
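
The division of a field into a name and conversion specifiers, as described in this section, might be sketched like this. The function split_field is a hypothetical name, and escaping of '}' is deliberately left out.

```python
def split_field(field):
    """Split the text between braces at the first colon.

    Returns (name, conversion); either part may be empty.
    Hypothetical helper, not part of the proposal.
    """
    name, _colon, conversion = field.partition(":")
    return name, conversion


plain = split_field("0.name")       # no specifiers at all
padded = split_field("0:8")         # name plus specifiers
nameless = split_field(":pad(23)")  # custom-formatter form, empty name
timed = split_field("0:%H:%M")      # only the first colon separates
```

Because only the first colon is significant, specifiers may themselves contain colons, which matters for types that use strftime-style specifiers.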
+
+
+Standard Conversion Specifiers
+
+ For most built-in types, the conversion specifiers will be the
+ same or similar to the existing conversion specifiers used with
+ the '%' operator. Thus, instead of '%02.2x', you will say
+ '{0:2.2x}'.
+
+ There are a few differences, however:
+
+ - The trailing letter is optional - you don't need to say '2.2d',
+ you can instead just say '2.2'. If the letter is omitted, the
+ value will be converted into its 'natural' form (that is, the
+ form that it would take if str() or unicode() were called on it)
+ subject to the field length and precision specifiers (if
+ supplied).
+
+ - Variable field width specifiers use a nested version of the {}
+ syntax, allowing the width specifier to be either a positional
+ or keyword argument:
+
+ "{0:{1}.{2}d}".format(a, b, c)
+
+ (Note: It might be easier to parse if these used a different
+ type of delimiter, such as parens - avoiding the need to create
+ a regex that handles the recursive case.)
+
+ - The support for length modifiers (which are ignored by Python
+ anyway) is dropped.
+
+ For non-built-in types, the conversion specifiers will be specific
+ to that type. An example is the 'datetime' class, whose
+ conversion specifiers are identical to the arguments to the
+ strftime() function:
+
+ "Today is: {0:%x}".format(datetime.now())
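
The following sketch exercises the three ideas in this section against the str.format method in current Python, whose specifiers may differ in detail from this draft. The sample values are invented. The nested example uses 'f' rather than 'd' because current Python rejects a precision on integer formats.

```python
from datetime import datetime

# Omitted trailing letter: the string is padded to the field width.
padded = "My name is {0:8}!".format("Fred")

# Nested fields supply the width and precision from other arguments.
nested = "{0:{1}.{2}f}".format(3.14159, 8, 3)

# A non-built-in type interprets its own specifiers (strftime codes here).
stamp = "Released: {0:%Y-%m-%d}".format(datetime(2006, 4, 26))

print(padded)
print(nested)
print(stamp)
```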
+
+
+Controlling Formatting
+
+ A class that wishes to implement a custom interpretation of its
+ conversion specifiers can implement a __format__ method:
+
+ class AST:
+     def __format__(self, specifiers):
+         ...
+
+ The 'specifiers' argument will be either a string object or a
+ unicode object, depending on the type of the original format
+ string. The __format__ method should test the type of the
+ specifiers parameter to determine whether to return a string or
+ unicode object. It is the responsibility of the __format__ method
+ to return an object of the proper type.
+
+ string.format() will format each field using the following steps:
+
+ 1) See if the value to be formatted has a __format__ method. If
+ it does, then call it.
+
+ 2) Otherwise, check the internal formatter within string.format
+ that contains knowledge of certain builtin types.
+
+ 3) Otherwise, call str() or unicode() as appropriate.
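
As an illustration of step 1, here is a small class with its own __format__ method. The Temperature class and its 'C' and 'F' specifiers are invented for this sketch; it runs on current Python, where str.format consults __format__ in the way described.

```python
class Temperature:
    """Hypothetical example type with custom conversion specifiers."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __format__(self, specifiers):
        # 'F' converts to Fahrenheit; '' or 'C' keeps Celsius.
        if specifiers == "F":
            return "{0:.1f} F".format(self.celsius * 9 / 5 + 32)
        if specifiers in ("", "C"):
            return "{0:.1f} C".format(self.celsius)
        raise ValueError("unknown specifier: " + specifiers)


text = "Outside: {0:F}, inside: {1}".format(Temperature(10), Temperature(21.5))
print(text)
```

The specifiers arrive as an uninterpreted string, so the class is free to define any mini-language it likes.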
+
+
+User-Defined Formatting Classes
+
+ The code that interprets format strings can be called explicitly
+ from user code. This allows the creation of custom formatter
+ classes that can override the normal formatting rules.
+
+ The string and unicode classes will have a class method called
+ 'cformat' that does all the actual work of formatting; the
+ format() method is just a wrapper that calls cformat.
+
+ The parameters to the cformat function are:
+
+ -- The format string (or unicode; the same function handles
+ both.)
+ -- A field format hook (see below)
+ -- A tuple containing the positional arguments
+ -- A dict containing the keyword arguments
+
+ The cformat function will parse all of the fields in the format
+ string, and return a new string (or unicode) with all of the
+ fields replaced with their formatted values.
+
+ For each field, the cformat function will attempt to call the
+ field format hook with the following arguments:
+
+ field_hook(value, conversion, buffer)
+
+ The 'value' argument corresponds to the value being formatted, which
+ was retrieved from the arguments using the field name. (The
+ field_hook has no control over the selection of values, only
+ how they are formatted.)
+
+ The 'conversion' argument is the conversion spec part of the
+ field, which will be either a string or unicode object, depending
+ on the type of the original format string.
+
+ The 'buffer' argument is a Python array object, either a byte
+ array or unicode character array. The buffer object will contain
+ the partially constructed string; the field hook is free to modify
+ the contents of this buffer if needed.
+
+ The field_hook will be called once per field. The field_hook may
+ take one of two actions:
+
+ 1) Return False, indicating that the field_hook will not
+ process this field and the default formatting should be
+ used. This decision should be based on the type of the
+ value object, and the contents of the conversion string.
+
+ 2) Append the formatted field to the buffer, and return True.
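
The hook protocol above could be approximated as follows. This is only a sketch under several assumptions: cformat is written as a plain function rather than a class method, the buffer is an ordinary list of string pieces rather than an array object, and parsing is borrowed from string.Formatter in today's standard library instead of the parser this draft describes. The upper_hook function and its 'upper' specifier are invented.

```python
import string


def cformat(format_string, field_hook, args, kwargs):
    """Format each field, giving field_hook the first chance at it."""
    formatter = string.Formatter()
    buffer = []  # pieces of the partially constructed string
    for literal, field_name, conversion, _ in formatter.parse(format_string):
        buffer.append(literal)
        if field_name is None:
            continue  # trailing literal text with no field after it
        value, _used = formatter.get_field(field_name, args, kwargs)
        # The hook returns True if it appended the formatted field itself.
        if not field_hook(value, conversion, buffer):
            buffer.append(format(value, conversion))  # default formatting
    return "".join(buffer)


def upper_hook(value, conversion, buffer):
    """Hypothetical hook: handles only strings with the 'upper' specifier."""
    if isinstance(value, str) and conversion == "upper":
        buffer.append(value.upper())
        return True
    return False


result = cformat("{0:upper} is {age:3d} years old", upper_hook,
                 ("fred",), {"age": 42})
print(result)
```

The hook declines the 'age' field by returning False, so that field falls through to the default formatting.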
+
+
+Alternate Syntax
+
+ Naturally, one of the most contentious issues is the syntax of the
+ format strings, and in particular the markup conventions used to
+ indicate fields.
+
+ Rather than attempting to exhaustively list all of the various
+ proposals, I will cover the ones that are most widely used
+ already.
+
+ - Shell variable syntax: $name and $(name) (or in some variants,
+ ${name}). This is probably the oldest convention out there, and
+ is used by Perl and many others. When used without the braces,
+ the length of the variable is determined by lexically scanning
+ until an invalid character is found.
+
+ This scheme is generally used in cases where interpolation is
+ implicit - that is, in environments where any string can contain
+ interpolation variables, and no special substitution function
+ need be invoked. In such cases, it is important to prevent the
+ interpolation behavior from occurring accidentally, so the '$'
+ (which is otherwise a relatively uncommonly-used character) is
+ used to signal when the behavior should occur.
+
+ It is the author's opinion, however, that in cases where the
+ formatting is explicitly invoked, less care needs to be
+ taken to prevent accidental interpolation, in which case a
+ lighter and less unwieldy syntax can be used.
+
+ - Printf and its cousins ('%'), including variations that add a
+ field index, so that fields can be interpolated out of order.
+
+ - Other bracket-only variations. Various MUDs (Multi-User
+ Dungeons) such as MUSH have used brackets (e.g. [name]) to do
+ string interpolation. The Microsoft .Net libraries use braces
+ ({}), and a syntax which is very similar to the one in this
+ proposal, although the syntax for conversion specifiers is quite
+ different. [2]
+
+ - Backquoting. This method has the benefit of minimal syntactical
+ clutter; however, it lacks many of the benefits of a function
+ call syntax (such as complex expression arguments, custom
+ formatters, etc.).
+
+ - Other variations include Ruby's #{}, PHP's {$name}, and so
+ on.
+
+
+Backwards Compatibility
+
+ Backwards compatibility can be maintained by leaving the existing
+ mechanisms in place. The new system does not collide with any of
+ the method names of the existing string formatting techniques, so
+ both systems can co-exist until it comes time to deprecate the
+ older system.
+
+
+References
+
+ [1] [Python-3000] String formating operations in python 3k
+ http://mail.python.org/pipermail/python-3000/2006-April/000285.html
+
+ [2] Composite Formatting - [.Net Framework Developer's Guide]
+ http://msdn.microsoft.com/library/en-us/cpguide/html/cpconcompositeformatting.asp?frame=true
+
+
+Copyright
+
+ This document has been placed in the public domain.
+
+
+Local Variables:
+mode: indented-text
+indent-tabs-mode: nil
+sentence-end-double-space: t
+fill-column: 70
+coding: utf-8
+End:
Added: peps/trunk/pep-3102.txt
==============================================================================
--- (empty file)
+++ peps/trunk/pep-3102.txt Wed Apr 26 22:33:25 2006
@@ -0,0 +1,184 @@
+PEP: 3102
+Title: Keyword-Only Arguments
+Version: $Revision$
+Last-Modified: $Date$
+Author: Talin <talin at acm.org>
+Status: Draft
+Type: Standards
+Content-Type: text/plain
+Created: 22-Apr-2006
+Python-Version: 3.0
+Post-History:
+
+
+Abstract
+
+ This PEP proposes a change to the way that function arguments are
+ assigned to named parameter slots. In particular, it enables the
+ declaration of "keyword-only" arguments: arguments that can only
+ be supplied by keyword and which will never be automatically
+ filled in by a positional argument.
+
+
+Rationale
+
+ The current Python function-calling paradigm allows arguments to
+ be specified either by position or by keyword. An argument can be
+ filled in either explicitly by name, or implicitly by position.
+
+ There are often cases where it is desirable for a function to take
+ a variable number of arguments. The Python language supports this
+ using the 'varargs' syntax ('*name'), which specifies that any
+ 'left over' arguments be passed into the varargs parameter as a
+ tuple.
+
+ One limitation on this is that currently, all of the regular
+ argument slots must be filled before the vararg slot can be.
+
+ This is not always desirable. One can easily envision a function
+ which takes a variable number of arguments, but also takes one
+ or more 'options' in the form of keyword arguments. Currently,
+ the only way to do this is to define both a varargs argument,
+ and a 'keywords' argument (**kwargs), and then manually extract
+ the desired keywords from the dictionary.
+
+
+Specification
+
+ Syntactically, the proposed changes are fairly simple. The first
+ change is to allow regular arguments to appear after a varargs
+ argument:
+
+ def sortwords(*wordlist, case_sensitive=False):
+     ...
+
+ This function accepts any number of positional arguments, and it
+ also accepts a keyword option called 'case_sensitive'. This
+ option will never be filled in by a positional argument, but
+ must be explicitly specified by name.
+
+ Keyword-only arguments are not required to have a default value.
+ Since Python requires that all arguments be bound to a value,
+ and since the only way to bind a value to a keyword-only argument
+ is via keyword, such arguments are therefore 'required keyword'
+ arguments. Such arguments must be supplied by the caller, and
+ they must be supplied via keyword.
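
A short sketch of both cases, runnable on current Python, which accepts this syntax. The body of sortwords and the connect function are invented for the illustration.

```python
def sortwords(*wordlist, case_sensitive=False):
    # Hypothetical body: fold case unless the option says otherwise.
    key = None if case_sensitive else str.lower
    return sorted(wordlist, key=key)


def connect(*hosts, timeout):
    # 'timeout' has no default, so it is a required keyword argument.
    return list(hosts), timeout


folded = sortwords("pear", "apple", "Banana")
exact = sortwords("pear", "apple", "Banana", case_sensitive=True)

try:
    connect("alpha", "beta")  # timeout not supplied by keyword
    missing_keyword_rejected = False
except TypeError:
    missing_keyword_rejected = True
```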
+
+ The second syntactical change is to allow the argument name to
+ be omitted for a varargs argument:
+
+ def compare(a, b, *, key=None):
+     ...
+
+ The reasoning behind this change is as follows. Imagine for a
+ moment a function which takes several positional arguments, as
+ well as a keyword argument:
+
+ def compare(a, b, key=None):
+     ...
+
+ Now, suppose you wanted to have 'key' be a keyword-only argument.
+ Under the above syntax, you could accomplish this by adding a
+ varargs argument immediately before the keyword argument:
+
+ def compare(a, b, *ignore, key=None):
+     ...
+
+ Unfortunately, the 'ignore' argument will also suck up any
+ erroneous positional arguments that may have been supplied by the
+ caller. Given that we'd prefer any unwanted arguments to raise an
+ error, we could do this:
+
+ def compare(a, b, *ignore, key=None):
+     if ignore:  # If ignore is not empty
+         raise TypeError
+
+ As a convenient shortcut, we can simply omit the 'ignore' name,
+ meaning 'don't allow any positional arguments beyond this point'.
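
For illustration, here is the shortcut form in use on current Python. The body of compare is invented; only the signature comes from the text above.

```python
def compare(a, b, *, key=None):
    # Hypothetical body: three-way comparison, optionally through 'key'.
    if key is not None:
        a, b = key(a), key(b)
    return (a > b) - (a < b)


by_value = compare(-3, 2)
by_magnitude = compare(-3, 2, key=abs)

try:
    compare(-3, 2, abs)  # a third positional argument is not allowed
    extra_positional_rejected = False
except TypeError:
    extra_positional_rejected = True
```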
+
+
+Function Calling Behavior
+
+ The previous section describes the difference between the old
+ behavior and the new. However, it is also useful to have a
+ description of the new behavior that stands by itself, without
+ reference to the previous model. So this next section will
+ attempt to provide such a description.
+
+ When a function is called, the input arguments are assigned to
+ formal parameters as follows:
+
+ - For each formal parameter, there is a slot which will be used
+ to contain the value of the argument assigned to that
+ parameter.
+
+ - Slots which have had values assigned to them are marked as
+ 'filled'. Slots which have no value assigned to them yet are
+ considered 'empty'.
+
+ - Initially, all slots are marked as empty.
+
+ - Positional arguments are assigned first, followed by keyword
+ arguments.
+
+ - For each positional argument:
+
+ o Attempt to bind the argument to the first unfilled
+ parameter slot. If the slot is not a vararg slot, then
+ mark the slot as 'filled'.
+
+ o If the next unfilled slot is a vararg slot, and it does
+ not have a name, then it is an error.
+
+ o Otherwise, if the next unfilled slot is a vararg slot then
+ all remaining non-keyword arguments are placed into the
+ vararg slot.
+
+ - For each keyword argument:
+
+ o If there is a parameter with the same name as the keyword,
+ then the argument value is assigned to that parameter slot.
+ However, if the parameter slot is already filled, then that
+ is an error.
+
+ o Otherwise, if there is a 'keyword dictionary' argument,
+ the argument is added to the dictionary using the keyword
+ name as the dictionary key, unless there is already an
+ entry with that key, in which case it is an error.
+
+ o Otherwise, if there is no keyword dictionary, and no
+ matching named parameter, then it is an error.
+
+ - Finally:
+
+ o If the vararg slot is not yet filled, assign an empty tuple
+ as its value.
+
+ o For each remaining empty slot: if there is a default value
+ for that slot, then fill the slot with the default value.
+ If there is no default value, then it is an error.
+
+ In accordance with the current Python implementation, any errors
+ encountered will be signaled by raising TypeError. (If you want
+ something different, that's a subject for a different PEP.)
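
The slot-filling procedure above can be modelled in a few lines. This is a sketch only: bind_arguments and its parameter layout are hypothetical, a function signature is described by plain lists and dicts, and passing None for 'vararg' stands both for "no vararg slot" and for the nameless '*' form, since surplus positional arguments are an error in either case.

```python
def bind_arguments(regular, vararg, keyword_only, kwdict, defaults,
                   args, kwargs):
    """Return a dict mapping parameter names to bound values."""
    slots = {}  # initially, all slots are empty

    # Positional arguments are assigned first.
    for index, value in enumerate(args):
        if index < len(regular):
            slots[regular[index]] = value
        elif vararg is not None:
            # All remaining positional arguments go into the vararg slot.
            slots[vararg] = tuple(args[index:])
            break
        else:
            raise TypeError("too many positional arguments")

    # Keyword arguments are assigned next.
    extra = {}
    for name, value in kwargs.items():
        if name in regular or name in keyword_only:
            if name in slots:
                raise TypeError("multiple values for %r" % name)
            slots[name] = value
        elif kwdict is not None:
            extra[name] = value  # a dict cannot hold duplicate keys
        else:
            raise TypeError("unexpected keyword argument %r" % name)

    # Finally, fill what is still empty.
    if vararg is not None and vararg not in slots:
        slots[vararg] = ()
    for name in list(regular) + list(keyword_only):
        if name not in slots:
            if name not in defaults:
                raise TypeError("missing argument %r" % name)
            slots[name] = defaults[name]
    if kwdict is not None:
        slots[kwdict] = extra
    return slots


# Models: def compare(a, b, *, key=None)
bound = bind_arguments(["a", "b"], None, ["key"], None, {"key": None},
                       (1, 2), {"key": abs})

# Models: def sortwords(*wordlist, case_sensitive=False)
words = bind_arguments([], "wordlist", ["case_sensitive"], None,
                       {"case_sensitive": False}, ("x", "y"), {})
```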
+
+
+Backwards Compatibility
+
+ The function calling behavior specified in this PEP is a superset
+ of the existing behavior - that is, it is expected that any
+ existing programs will continue to work.
+
+
+Copyright
+
+ This document has been placed in the public domain.
+
+
+Local Variables:
+mode: indented-text
+indent-tabs-mode: nil
+sentence-end-double-space: t
+fill-column: 70
+coding: utf-8
+End: