[Python-checkins] CVS: python/nondist/peps pep-0218.txt,1.4,1.5
Greg Wilson
gvwilson@users.sourceforge.net
Mon, 07 May 2001 12:51:12 -0700
Update of /cvsroot/python/python/nondist/peps
In directory usw-pr-cvs1:/tmp/cvs-serv28802
Modified Files:
pep-0218.txt
Log Message:
Updating PEP to reflect prototype implementation
Index: pep-0218.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0218.txt,v
retrieving revision 1.4
retrieving revision 1.5
diff -C2 -r1.4 -r1.5
*** pep-0218.txt 2000/12/14 17:11:17 1.4
--- pep-0218.txt 2001/05/07 19:51:10 1.5
***************
*** 2,9 ****
Title: Adding a Built-In Set Object Type
Version: $Revision$
! Author: gvwilson@nevex.com (Greg Wilson)
Status: Draft
Type: Standards Track
! Python-Version: 2.1
Created: 31-Jul-2000
Post-History:
--- 2,9 ----
Title: Adding a Built-In Set Object Type
Version: $Revision$
! Author: gvwilson@ddj.com (Greg Wilson)
Status: Draft
Type: Standards Track
! Python-Version: 2.2
Created: 31-Jul-2000
Post-History:
***************
*** 12,26 ****
Introduction
! This PEP proposes adding sets as a built-in type in Python.
Rationale
- One of Python's greatest strengths as a teaching language is its
- clarity. Its syntax and object model are so clean, and so simple,
- that it can serve as "executable pseudocode". Anything that makes
- it even better suited for this role will help increase its use in
- school and college courses.
-
Sets are a fundamental mathematical structure, and are very
commonly used in algorithm specifications. They are much less
--- 12,29 ----
Introduction
! This PEP proposes adding a Set module to the standard Python
! library, and to then make sets a built-in Python type if that
! module is widely used. After explaining why sets are desirable,
! and why the common idiom of using dictionaries in their place is
! inadequate, we describe how we intend built-in sets to work, and
! then how the preliminary Set module will behave. The penultimate
! section discusses the mutability (or otherwise) of sets and set
! elements, and the solution which the Set module will implement.
! The last section then looks at alternatives that were considered,
! but discarded.
Rationale
Sets are a fundamental mathematical structure, and are very
commonly used in algorithm specifications. They are much less
***************
*** 43,59 ****
! Proposal
! We propose adding a set type to Python. This type will be an
! unordered collection of unique values, just as a dictionary is an
! unordered collection of key/value pairs. Constant sets will be
! represented using the usual mathematical notation, so that
! "{1, 2, 3}" will be a set of three integers.
! In order to avoid ambiguity, the empty set will be written "{,}",
rather than "{}" (which is already used to represent empty
dictionaries). We feel that this notation is as reasonable as the
use of "(3,)" to represent single-element tuples; a more radical
! strategy is discussed in the "Alternatives" section.
Iteration and comprehension will be implemented in the obvious
--- 46,64 ----
! Long-Term Proposal
! The long-term goal of this PEP is to add a built-in set type to
! Python. This type will be an unordered collection of unique
! values, just as a dictionary is an unordered collection of
! key/value pairs. Constant sets will be represented using the
! usual mathematical notation, so that "{1, 2, 3}" will be a set of
! three integers.
! In order to avoid ambiguity, the empty set will be written "{-}",
rather than "{}" (which is already used to represent empty
dictionaries). We feel that this notation is as reasonable as the
use of "(3,)" to represent single-element tuples; a more radical
! strategy is discussed in the "Alternatives" section, and more
! readable than the earlier proposal "{,}".
Iteration and comprehension will be implemented in the obvious
***************
*** 67,170 ****
will produce a set containing the squares of all elements in S,
!
! Membership will be tested using "in" and "not in".
!
! The binary operators '|', '&', '-', and "^" will implement set
! union, intersection, difference, and symmetric difference. Their
! in-place equivalents will have the obvious semantics. (We feel
! that it is more sensible to overload the bitwise operators '|' and
! '&', rather than the arithmetic operators '+' and "*', because
! there is no arithmetic equivalent of '^'.)
!
! The method "add" will add an element to a set. This is different
! from set union, as the following example shows:
!
! >>> {1, 2, 3} | {4, 5, 6}
! {1, 2, 3, 4, 5, 6}
!
! >>> {1, 2, 3}.add({4, 5, 6})
! {1, 2, 3, {4, 5, 6}}
!
! Note that we expect that items can also be added to sets using
! in-place union of temporaries, i.e. "S |= {x}" instead of
! "S.add(x)".
!
! Elements will be deleted from sets using a "remove" method, or
! using "del":
!
! >>> S = {1, 2, 3}
! >>> S.remove(3)
! >>> S
! {1, 2}
! >>> del S[1]
! >>> S
! {2}
!
! The "KeyError" exception will be raised if an attempt is made to
! remove an element which is not in a set. This definition of "del"
! is consistent with that used for dictionaries:
!
! >>> D = {1:2, 3:4}
! >>> del D[1]
! >>> D
! {3:4}
!
! A new method "dict.keyset" will return the keys of a dictionary as
! a set. A corresponding method "dict.valueset" will return the
! dictionary's values as a set.
!
! A built-in converter "set()" will convert any sequence type to a
! set; converters such as "list()" and "tuple()" will be extended to
! handle sets as input.
!
!
! Open Issues
!
! One major issue remains to be resolved: will sets be allowed to
! contain mutable values, or will their values be required to
! immutable (as dictionary keys are)? The disadvantages of allowing
! only immutable values are clear --- if nothing else, it would
! prevent users from creating sets of sets.
!
! However, no efficient implementation of sets of mutable values has
! yet been suggested. Hashing approaches will obviously fail (which
! is why mutable values are not allowed to be dictionary keys).
! Even simple-minded implementations, such as storing the set's
! values in a list, can give incorrect results, as the following
! example shows:
!
! >>> a = [1, 2]
! >>> b = [3, 4]
! >>> S = [a, b]
! >>> a[0:2] = [3, 4]
! >>> S
! [[3, 4], [3, 4]]
!
! One way to solve this problem would be to add observer/observable
! functionality to every data structure in Python, so that
! structures would know to update themselves when their contained
! values mutated. This is clearly impractical given the current
! code base, and the performance penalties (in both memory and
! execution time) would probably be unacceptable anyway.
Alternatives
! A more conservative alternative to this proposal would be to add a
! new built-in class "Set", rather than adding new syntax for direct
! expression of sets. On the positive side, this would not require
! any changes to the Python language definition. On the negative
! side, people would then not be able to write Python programs using
! the same notation as they would use on a whiteboard. We feel that
! the more Python supports standard pre-existing notation, the
! greater the chances of it being adopted as a teaching language.
!
! A radical alternative to the (admittedly clumsy) notation "{,}" is
! to re-define "{}" to be the empty collection, rather than the
! empty dictionary. Operations which made this object non-empty
! would silently convert it to either a dictionary or a set; it
! would then retain that type for the rest of its existence. This
! idea was rejected because of its potential impact on existing
! Python programs. A similar proposal to modify "dict.keys" and
"dict.values" to return sets, rather than lists, was rejected for
the same reasons.
--- 72,188 ----
will produce a set containing the squares of all elements in S,
! Membership will be tested using "in" and "not in", and basic set
! operations will be implemented by a mixture of overloaded
! operators:
!
! | union
! & intersection
! ^ symmetric difference
! - asymmetric difference
!
! and methods:
!
! S.add(x) Add "x" to the set.
!
! S.update(s) Add all elements of sequence "s" to the set.
!
! S.remove(x) Remove "x" from the set. If "x" is not
! present, this method raises a LookupError
! exception.
!
! S.discard(x) Remove "x" from the set if it is present, or
! do nothing if it is not.
!
! S.popitem() Remove and return an arbitrary element,
! raising a LookupError if the element is not
! present.
!
! and one new built-in conversion function:
!
! set(x) Create a set containing the elements of the
! collection "x".
!
! Notes:
!
! 1. We propose using the bitwise operators "|&" for intersection
! and union. While "+" for union would be intuitive, "*" for
! intersection is not (very few of the people asked guessed what
! it did correctly).
!
! 2. We considered using "+" to add elements to a set, rather than
! "add". However, Guido van Rossum pointed out that "+" is
! symmetric for other built-in types (although "*" is not). Use
! of "add" will also avoid confusion between that operation and
! set union.
!
! 3. Sets raise "LookupError" exceptions, rather than "KeyError" or
! "ValueError", because set elements are neither keys nor values.
!
! Short-Term Proposal
!
! In order to determine whether there is enough demand for sets to
! justify making them a built-in type, and to give users a chance to
! try out the semantics we propose for sets, our short-term proposal
! is to add a "Set" class to the standard Python library. This
! class will have the operators and methods described above; it will
! also have named methods corresponding to all of the operations: a
! "union" method for "|", and a "union_update" method for "|=", and
! so on.
!
! This class will use a dictionary internally to contain set values.
! In order to avoid having to duplicate values (e.g. for iteration
! through the set), the class will rely on the iterators which are
! scheduled to appear in Python 2.2.
!
! Tim Peters believes that the class's constructor should take a
! single sequence as an argument, and populate the set with that
! sequence's elements. His argument is that in most cases,
! programmers will be created sets from pre-existing sequences, so
! that common case should be usual. However, this would require
! users to remember an extra set of parentheses when initializing a
! set with known values:
!
! >>> Set((1, 2, 3, 4)) # case 1
!
! On the other hand, feedback from a small number of novice Python
! users (all of whom were very experienced with other languages)
! indicates that people will find a "parenthesis-free" syntax more
! natural:
!
! >>> Set(1, 2, 3, 4) # case 2
!
! On the other, other hand, if Python does adopt a dictionary-like
! notation for sets in the future, then case 2 will become
! redundant. We have therefore adopted the first strategy, in which
! the initializer takes a single sequence argument.
!
!
! Mutability
!
! The most difficult question to resolve in this proposal was
! whether sets ought to be able to contain mutable elements. A
! dictionary's keys must be immutable in order to support fast,
! reliable lookup. While it would be easy to require set elements
! to be immutable, this would preclude sets of sets (which are
! widely used in graph algorithms and other applications).
!
! At Tim Peters' suggestion, we will implement the following
! compromise. A set may only contain immutable elements, but is
! itself mutable *until* its hash code is calculated. As soon as
! that happens, the set is "frozen", i.e. becomes immutable. Thus,
! a set may be used as a dictionary key, or as a set element, but
! cannot be updated after this is done. Peters reports that this
! behavior rarely causes problems in practice.
Alternatives
! An alternative to the notation "{-}" for the empty set would be to
! re-define "{}" to be the empty collection, rather than the empty
! dictionary. Operations which made this object non-empty would
! silently convert it to either a dictionary or a set; it would then
! retain that type for the rest of its existence. This idea was
! rejected because of its potential impact on existing Python
! programs. A similar proposal to modify "dict.keys" and
"dict.values" to return sets, rather than lists, was rejected for
the same reasons.