[Python-checkins] CVS: python/nondist/peps pep-0218.txt,1.3,1.4
Barry Warsaw
python-dev@python.org
Thu, 14 Dec 2000 09:11:20 -0800
Update of /cvsroot/python/python/nondist/peps
In directory slayer.i.sourceforge.net:/tmp/cvs-serv19636
Modified Files:
pep-0218.txt
Log Message:
Greg Wilson's latest version
Index: pep-0218.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0218.txt,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -r1.3 -r1.4
*** pep-0218.txt 2000/11/27 04:10:06 1.3
--- pep-0218.txt 2000/12/14 17:11:17 1.4
***************
*** 17,28 ****
Rationale
! Sets are a fundamental mathematical structure, and are commonly
! used to specify algorithms. They are much less frequently used in
! implementations, even when they are the "right" structure.
! Programmers frequently use lists instead, even when the ordering
! information in lists is irrelevant, and by-value lookups are
! frequent. (Most medium-sized C programs contain a depressing
! number of start-to-end searches through malloc'd vectors to
! determine whether particular items are present or not...)
Programmers are often told that they can implement sets as
--- 17,35 ----
Rationale
! One of Python's greatest strengths as a teaching language is its
! clarity. Its syntax and object model are so clean, and so simple,
! that it can serve as "executable pseudocode". Anything that makes
! it even better suited for this role will help increase its use in
! school and college courses.
!
! Sets are a fundamental mathematical structure, and are very
! commonly used in algorithm specifications. They are much less
! frequently used in implementations, even when they are the "right"
! structure. Programmers frequently use lists instead, even when
! the ordering information in lists is irrelevant, and by-value
! lookups are frequent. (Most medium-sized C programs contain a
! depressing number of start-to-end searches through malloc'd
! vectors to determine whether particular items are present or
! not...)
Programmers are often told that they can implement sets as
***************
*** 30,46 ****
these "sets" by assigning the "don't care" value to them;
membership can be tested using "dict.has_key"; and items can be
! deleted using "del". However, the three main binary operations
! on sets --- union, intersection, and difference --- are not
! directly supported by this representation, since their meaning is
! ambiguous for dictionaries containing key/value pairs.
Proposal
! We propose adding a new built-in type to Python to represent sets.
! This type will be an unordered collection of unique values, just
! as a dictionary is an unordered collection of key/value pairs.
! Constant sets will be represented using the usual mathematical
! notation, so that "{1, 2, 3}" will be a set of three integers.
In order to avoid ambiguity, the empty set will be written "{,}",
--- 37,53 ----
these "sets" by assigning the "don't care" value to them;
membership can be tested using "dict.has_key"; and items can be
! deleted using "del". However, the other main operations on sets
! (union, intersection, and difference) are not directly supported
! by this representation, since their meaning is ambiguous for
! dictionaries containing key/value pairs.
Proposal
! We propose adding a set type to Python. This type will be an
! unordered collection of unique values, just as a dictionary is an
! unordered collection of key/value pairs. Constant sets will be
! represented using the usual mathematical notation, so that
! "{1, 2, 3}" will be a set of three integers.
In order to avoid ambiguity, the empty set will be written "{,}",
***************
*** 48,52 ****
dictionaries). We feel that this notation is as reasonable as the
use of "(3,)" to represent single-element tuples; a more radical
! alternative is discussed in the "Alternatives" section.
Iteration and comprehension will be implemented in the obvious
--- 55,59 ----
dictionaries). We feel that this notation is as reasonable as the
use of "(3,)" to represent single-element tuples; a more radical
! strategy is discussed in the "Alternatives" section.
Iteration and comprehension will be implemented in the obvious
***************
*** 65,69 ****
The binary operators '|', '&', '-', and "^" will implement set
union, intersection, difference, and symmetric difference. Their
! in-place equivalents will have the obvious semantics.
The method "add" will add an element to a set. This is different
--- 72,79 ----
The binary operators '|', '&', '-', and "^" will implement set
union, intersection, difference, and symmetric difference. Their
! in-place equivalents will have the obvious semantics. (We feel
! that it is more sensible to overload the bitwise operators '|' and
! '&', rather than the arithmetic operators '+' and "*', because
! there is no arithmetic equivalent of '^'.)
The method "add" will add an element to a set. This is different
***************
*** 84,96 ****
>>> S = {1, 2, 3}
>>> del S[1]
>>> S
- {2, 3}
- >>> S.remove(3)
{2}
The "KeyError" exception will be raised if an attempt is made to
! remove an element which is not in a set.
A new method "dict.keyset" will return the keys of a dictionary as
a set. A corresponding method "dict.valueset" will return the
--- 94,113 ----
>>> S = {1, 2, 3}
+ >>> S.remove(3)
+ >>> S
+ {1, 2}
>>> del S[1]
>>> S
{2}
The "KeyError" exception will be raised if an attempt is made to
! remove an element which is not in a set. This definition of "del"
! is consistent with that used for dictionaries:
+ >>> D = {1:2, 3:4}
+ >>> del D[1]
+ >>> D
+ {3:4}
+
A new method "dict.keyset" will return the keys of a dictionary as
a set. A corresponding method "dict.valueset" will return the
***************
*** 102,106 ****
--- 119,162 ----
+ Open Issues
+
+ One major issue remains to be resolved: will sets be allowed to
+ contain mutable values, or will their values be required to
+ immutable (as dictionary keys are)? The disadvantages of allowing
+ only immutable values are clear --- if nothing else, it would
+ prevent users from creating sets of sets.
+
+ However, no efficient implementation of sets of mutable values has
+ yet been suggested. Hashing approaches will obviously fail (which
+ is why mutable values are not allowed to be dictionary keys).
+ Even simple-minded implementations, such as storing the set's
+ values in a list, can give incorrect results, as the following
+ example shows:
+
+ >>> a = [1, 2]
+ >>> b = [3, 4]
+ >>> S = [a, b]
+ >>> a[0:2] = [3, 4]
+ >>> S
+ [[3, 4], [3, 4]]
+
+ One way to solve this problem would be to add observer/observable
+ functionality to every data structure in Python, so that
+ structures would know to update themselves when their contained
+ values mutated. This is clearly impractical given the current
+ code base, and the performance penalties (in both memory and
+ execution time) would probably be unacceptable anyway.
+
+
Alternatives
+
+ A more conservative alternative to this proposal would be to add a
+ new built-in class "Set", rather than adding new syntax for direct
+ expression of sets. On the positive side, this would not require
+ any changes to the Python language definition. On the negative
+ side, people would then not be able to write Python programs using
+ the same notation as they would use on a whiteboard. We feel that
+ the more Python supports standard pre-existing notation, the
+ greater the chances of it being adopted as a teaching language.
A radical alternative to the (admittedly clumsy) notation "{,}" is