
Well, another proof of the bugs in the PEP process -- my remarks were lost, so I'll send them here.

Let me note that almost everything Greg Wilson wants to do can be done via a Python class implementing a set using a dictionary mapping to None. Almost?

* No builtin syntax: import Set;Set(1,2,3) instead of {1,2,3}
* Converters: if we want list/tuple to have a semblance of efficiency, we'll need to cache the element list as a list when accessed by index.
* Two different constructors: set() for building from sequence, Set() for building from elements. Might be confusing.
* Only possible elements of a set are immutables. OTOH, I'm not sure how Greg intends to implement his sets if these sets are allowed to contain mutable elements.

-- Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. Please stop the spread of signature viruses!
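A minimal sketch of the dictionary-backed set described in the message above, written for present-day Python 3 (an assumption; the thread predates it). The class name `Set`, the helper `set_from_sequence`, and the method names are illustrative only and are not taken from anyone's actual implementation.

```python
class Set:
    """Toy set backed by a dict whose values are all None."""

    def __init__(self, *elements):
        # Set(1, 2, 3): build from individual elements.
        self._d = {}
        for x in elements:
            self._d[x] = None  # keys must be hashable

    def add(self, x):
        self._d[x] = None

    def __contains__(self, x):
        return x in self._d

    def __len__(self):
        return len(self._d)

    def union(self, other):
        # Functional union: returns a new Set, leaves both inputs alone.
        result = Set(*self._d)
        result._d.update(other._d)
        return result

    def tolist(self):
        return list(self._d)


def set_from_sequence(seq):
    """Hypothetical second constructor: build a Set from a sequence."""
    return Set(*seq)
```

The last bullet shows up directly here: `Set([1, 2])` raises `TypeError`, because a list cannot be used as a dictionary key.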

Moshe Zadka <moshez@zadka.site.co.il>:
I agree. I wrote a full-featured set library long ago. All it's waiting for is rich comparisons. Wait...are those in 2.0?

-- Eric S. Raymond (http://www.tuxedo.org/~esr/)

It will be of little avail to the people, that the laws are made by men of their own choice, if the laws be so voluminous that they cannot be read, or so incoherent that they cannot be understood; if they be repealed or revised before they are promulgated, or undergo such incessant changes that no man, who knows what the law is to-day, can guess what it will be to-morrow. Law is defined to be a rule of action; but how can that be a rule, which is little known, and less fixed? -- James Madison, Federalist Papers 62

On Mon, 27 Nov 2000, Eric S. Raymond wrote:
I agree. I wrote a full-featured set library long ago. All it's waiting for is rich comparisons. Wait...are those in 2.0?
Nope. I'm not even sure if they're going to be in 2.1. DA? Anyway, can you at least give the URL? I wrote something like that too (is this another Python ritual?), and I'd be happy to try and integrate them for 2.1.

-- Moshe Zadka <moshez@math.huji.ac.il> -- 95855124 http://advogato.org/person/moshez

[Moshe, on rich comparisons]
Nope. I'm not even sure if they're going to be in 2.1. DA?
Yes, they should go into 2.1. Moshe, you are listed as co-author of PEP 207. What's up? Also, I thought you were going to co-author PEP 208 (Reworking the Coercion Model). What's up with that? Both are empty! --Guido van Rossum (home page: http://www.python.org/~guido/)

Hi, folks. Is there a standard or usual way to provide version information inside a Python module or class? Special tokens in the doc string, a method like "getVersion"...? Thanks, Greg

Greg Wilson writes:
Modules sometimes offer a "__version__" attribute, which I've also seen called "version" or "VERSION" in strange cases (xmllib and (I think) PIL). -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Digital Creations

Greg Wilson wrote:
AFAIK, __version__ with a string value is in common usage both in modules and classes.

BTW, while we're at it: with the growing number of dependencies between modules, packages and the Python lib version... how about creating a standard to enable versioning in module/package imports? It would have to meet (at least) these requirements:

* import of a specific version
* alias of the unversioned name to the most recent version available

Thoughts?

-- Marc-Andre Lemburg
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

AFAIK, __version__ with a string value is in common usage both in modules and classes.
Correct. This was agreed upon as a standard long ago. It's probably not documented anywhere as such.
This is a major rathole. I don't know of any other language or system that has solved this in a satisfactory way. Most shared library implementations have had to deal with this, but it's generally a very painful manual process to ensure compatibility. I would rather stay out of trying to solve this for Python in a generic way -- when specific packages develop incompatible versions, it's up to the package authors to come up with a backwards compatibility strategy.

--Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
Probably should be... along with some other __special_name__ which are in use, e.g. __copyright__ or __author__. I think it would be a good idea to add these attribute names as virtually "reserved" attribute names.
Well, I think it should at least be possible to install multiple versions of the same package in some way which makes it possible for other programs to choose which version to take. With packages and the default import mechanism this is possible using some caching and __path__ trickery -- I think Pmw does this.

The new "from ... import ... as" also provides ways for this:

    from mx import DateTime160 as DateTime

The problem with the latter approach is that pickles will use the versioned name and thus won't port easily to later versions.

But you're probably right: only the package itself will know if it's ok to use the newer version or not. Would still be nice if we had some kind of documented naming scheme for versioned modules and packages though, so that the user will at least recognize these as being versioned import names.

-- Marc-Andre Lemburg
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
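One way to picture the "versioned import name" idea is a helper that tries a list of versioned module names and binds the first one found to an unversioned alias. The sketch below is for Python 3 (an assumption); `import_versioned` and the `demo_datetime*` modules are hypothetical, and the fake modules are registered in `sys.modules` only so the example runs without any real package installed.

```python
import importlib
import sys
import types


def import_versioned(base, versions):
    """Return the first importable module named base + version."""
    for version in versions:
        try:
            return importlib.import_module(base + version)
        except ImportError:
            continue  # this version is not installed; try the next one
    raise ImportError("no version of %s found among %r" % (base, versions))


# Register two fake modules standing in for installed versions.
for _name, _ver in (("demo_datetime160", "1.6.0"), ("demo_datetime200", "2.0.0")):
    _mod = types.ModuleType(_name)
    _mod.__version__ = _ver
    sys.modules[_name] = _mod

# Prefer the newest version, falling back to older ones.
DateTime = import_versioned("demo_datetime", ["300", "200", "160"])
print(DateTime.__name__, DateTime.__version__)
```

The alias `DateTime` hides the version from the calling code, but the module's `__name__` is still the versioned one, which is the name a pickle would record. That is the portability problem mentioned above.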

On 27 November 2000, Guido van Rossum said:
I think that the "Distributing Python Modules" manual should have a "Recommended Practices" section to cover things like this. My personal convention is that every single source file has a __revision__ variable, and the "root" module [1] of a module distribution has __version__.

[1] the "root module" is the highest-level __init__.py in a package-ized distribution, or the module itself in a single-file distribution. Most non-packageized multi-module distributions have a fairly obvious place to put __version__.

Greg
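As a rough illustration of the convention discussed in this sub-thread, the sketch below reads a version string from a module, trying `__version__` first and then the `version` / `VERSION` spellings mentioned earlier. The helper name `get_version` and the demo modules are hypothetical, and the code assumes Python 3.

```python
import types


def get_version(obj, default=None):
    """Return the version string of a module or class, if it has one."""
    for attr in ("__version__", "version", "VERSION"):
        value = getattr(obj, attr, None)
        # Only accept strings; "version" could also be a function or submodule.
        if isinstance(value, str):
            return value
    return default


# A stand-in for the "root" module of a distribution.
root = types.ModuleType("demo_root")
root.__version__ = "1.2.3"
root.__revision__ = "$Revision: 1.7 $"  # per-file revision, per the convention above

# A stand-in for a module using one of the unusual spellings.
odd = types.ModuleType("demo_odd")
odd.VERSION = "0.9"

print(get_version(root), get_version(odd), get_version(types))
```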

On Mon, 27 Nov 2000, Guido van Rossum wrote:
Ooops, the PEP-0207 thing is a mistake -- I meant to change 0208. No time now, I'll change 0208 back to DA later tonight. (Sorry for being so israelo-centric <wink>). The 0208 PEP should get done before the weekend. (Your weekend, not mine). -- Moshe Zadka <moshez@math.huji.ac.il> -- 95855124 http://advogato.org/person/moshez

Guido van Rossum <guido@python.org>:
Good. Within a few hours after I see that feature in the beta I'll have a very rich Set class for the standard library. It includes all the standard boolean operations, symmetric difference, powerset, and useful overloads for most of the standard operators. As the header comment says:

    # A set-algebra module for Python
    #
    # The functions work on any sequence type and return lists.
    # The set methods can take a set or any sequence type as an argument.

Most of the code is tested already. All it's waiting on is rich comparisons so I can handle partial-ordering of sets properly.

-- Eric S. Raymond (http://www.tuxedo.org/~esr/)

When only cops have guns, it's called a "police state". -- Claire Wolfe, "101 Things To Do Until The Revolution"
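To see why rich comparisons matter for the partial ordering of sets: a single three-way comparison has to answer "less", "equal" or "greater", but two sets can be incomparable (neither contains the other). The sketch below uses the per-operator methods available in present-day Python 3; the class `PartialSet` is purely illustrative and is not Eric's library.

```python
class PartialSet:
    """Toy dict-backed set whose comparisons mean subset/superset."""

    def __init__(self, seq=()):
        self._d = dict.fromkeys(seq)

    def __eq__(self, other):
        if not isinstance(other, PartialSet):
            return NotImplemented
        return self._d.keys() == other._d.keys()

    def __le__(self, other):
        # self <= other means "self is a subset of other".
        if not isinstance(other, PartialSet):
            return NotImplemented
        return all(x in other._d for x in self._d)

    def __lt__(self, other):
        # Proper subset: a subset with strictly fewer elements.
        if not isinstance(other, PartialSet):
            return NotImplemented
        return len(self._d) < len(other._d) and self <= other

    def __ge__(self, other):
        if not isinstance(other, PartialSet):
            return NotImplemented
        return other <= self

    def __gt__(self, other):
        if not isinstance(other, PartialSet):
            return NotImplemented
        return other < self


a = PartialSet([1, 2])
b = PartialSet([2, 3])
# Neither set contains the other, so all three tests are false.
print(a <= b, b <= a, a == b)
print(PartialSet([1]) < a)
```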

(Note: PEP 0218 is very much a work in progress --- I just wanted to get some preliminary thoughts down so that my conscience wouldn't nag me quite so much... :-)
Well, another proof of the bugs in the PEP process -- my remarks were lost, so I'll send them here.
Where did you send your comments? Thanks, Greg

[Moshe Zadka]
Like everyone else, I have a Set class too. Its __getitem__ is

    # so "for x in set:" works
    # see also set.tolist()
    def __getitem__(self, i):
        if i == 0:
            self.keys = self.d.keys()
        return self.keys[i]

Of course I don't *want* set[i] to mean anything at all; this is just a hack to work with the current iteration protocol (i.e., Set.__getitem__ makes no sense except in that it must be implemented so that "for x in set" works today). The one thing I could never get used to in Aaron's kjbuckets implementation of sets is that for/in did not work as expected.
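The `__getitem__` trick above relies on the index-based iteration protocol: a `for` loop calls `__getitem__` with 0, 1, 2, ... until `IndexError` is raised. Python 3 still falls back to this protocol when a class defines no `__iter__`, so the hack can be sketched in runnable form. The class name `IterableSet` is an illustrative stand-in, and `list(...)` is added because `dict.keys()` is a view rather than a list in Python 3.

```python
class IterableSet:
    """Toy dict-backed set that is iterable only via __getitem__."""

    def __init__(self, seq=()):
        self.d = {}
        for x in seq:
            self.d[x] = 1

    def __getitem__(self, i):
        # Snapshot the keys when a loop starts at index 0. Indexing with
        # any other value before that fails, as in the original hack.
        if i == 0:
            self.keys = list(self.d.keys())
        # Running off the end raises IndexError, which ends the for loop.
        return self.keys[i]


s = IterableSet("abca")
for x in s:
    print(x)
```

In current Python one would normally define `__iter__` and return `iter(self.d)` instead, which avoids giving `set[i]` any meaning at all.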
* Two different constructors: set() for building from sequence, Set() for building from elements. Might be confusing.
My Set.__init__:

    def __init__(self, seq=[]):
        self.d = d = {}
        for x in seq:
            d[x] = 1

That is, the constructor takes a sequence. Same thing in Icon, which added a set type late in the game. No problem in practice.
This needs to be addressed, but several approaches are possible. For example, my Set class allows for Sets of Sets (& so on), because I needed them, and Sets are certainly mutable. Now immutable objects are required because dicts require immutable objects as keys. However, a class (like Set!) can fool dicts, by supplying a __hash__ and a __cmp__ "that work". Dicts can be compared directly, so __cmp__ is no problem. The difficulty comes with the __hash__. My solution was to "freeze" a Set the instant __hash__ was invoked: this allows mutation up until the point a hash is captured, and disallows it thereafter. Concretely,

    def __hash__(self):
        if self.frozen:
            hashcode = self.hashcode
        else:
            # The hash code must not depend on the order of the
            # keys.
            self.frozen = 1
            hashcode = 0
            _hash = hash
            for x in self.d.keys():
                hashcode = hashcode ^ _hash(x)
            self.hashcode = hashcode
        return hashcode

and all the mutating methods check self.frozen, raising ValueError if the target set is currently frozen. For example,

    # updating union
    def unionu(self, other):
        if self.frozen:
            raise ValueError(_frozenmsg)
        self.d.update(other.d)

Oddly enough, I almost never got a "frozen" error in practice; it seems that by the time I was ready to build a set of sets, the constituent sets were at their final values; and the few times I did get a frozen error, it was actually a logic bug (I was mutating the set in error). It's hard to guess whether that's unique to me <0.5 wink>.

Icon has a different approach entirely: an Icon set acts like a Python dict *would* act if you inserted the id()s of the keys rather than the keys themselves; that is, Icon's sets (and dicts) are based on object identity, not object value. Since object identity is invariant regardless of whether the object is mutable, a hash table implementation has no difficulties. I find identity semantics for sets less useful than value semantics, though.

[Eric Raymond]
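Returning to the freeze-on-hash idea described just above, here is a self-contained sketch for Python 3 (an assumption). It uses `__eq__` where the original relied on `__cmp__`. The class name `FreezingSet`, the `add` method, and the error message are illustrative and are not Tim's actual code.

```python
import copy


class FreezingSet:
    """Toy set that becomes immutable the first time it is hashed."""

    def __init__(self, seq=()):
        self.d = dict.fromkeys(seq, 1)
        self.frozen = False
        self.hashcode = None

    def __eq__(self, other):
        if not isinstance(other, FreezingSet):
            return NotImplemented
        return self.d.keys() == other.d.keys()

    def __hash__(self):
        if not self.frozen:
            self.frozen = True
            hashcode = 0
            for x in self.d:
                hashcode ^= hash(x)  # XOR is independent of key order
            self.hashcode = hashcode
        return self.hashcode

    def __contains__(self, x):
        return x in self.d

    def add(self, x):
        if self.frozen:
            raise ValueError("cannot mutate a set after it has been hashed")
        self.d[x] = 1

    def __copy__(self):
        # A copy always starts out unfrozen ("melted").
        return FreezingSet(self.d)


inner = FreezingSet([1, 2])
inner.add(3)                    # fine: not hashed yet
outer = FreezingSet([inner])    # using inner as a dict key freezes it
melted = copy.copy(inner)
melted.add(4)                   # the copy can still be mutated
```

After the last line, `inner.add(4)` would raise `ValueError`, while `FreezingSet([3, 2, 1]) in outer` is true because the hash does not depend on element order.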
I'm afraid this is a good example of why a set type is unlikely to make it into the std distribution: whenever a data structure is added to Python, the agitation for variations is endless. For example, returning a list makes your flavor of sets unsuitable (because inefficient) for many applications (e.g., there's no efficient way to test a list for element membership). "Almost all" set implementations I've seen for Python use dicts for that reason.

Another set of complaints will be due to some people wanting functional versions of the set operations, while others want mutating versions. My own Set class supplies both, because both are really needed in practice; e.g., the updating union was shown above; the functional union builds on that:

    # functional union
    def union(self, other):
        answer = self.__copy__()
        answer.unionu(other)
        return answer

(subtlety: the functional union has no problem with "frozen" sets; Set.__copy__ always returns a melted set).

Since it's Greg's PEP, it's up to him to make everyone happy <wink> -- but since there's so much variation possible, I'm not sure Guido will ever bless any Set class as "the" Set class.

BTW, one lesson to take from SETL: a vital set operation in practice is a mutating "pick a 'random' element E from the set, remove it from the set, and return it". An enormous number of set algorithms are of the form

    while not S.empty():
        pick some element from S
        deal with it, possibly mutating S

This operation can't be done efficiently in Python code if the set is represented by a dict (the best you can do is materialize the full list of keys first, and pick one of those). That means my Set class often takes quadratic time for what *should* be linear-time algorithms.

if-the-set-type-is-a-toy-there's-no-need-to-put-it-in-the-distribution-but-if-it's-not-a-toy-it-can't-be-written-in-python-today-ly y'rs - tim
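The "pick an element, remove it, deal with it" loop can be sketched as a worklist algorithm. The example below assumes a present-day Python 3, where `dict.popitem()` removes and returns some entry without first materializing the list of keys; that method was not available when this message was written. The function `reachable` and the sample graph are hypothetical.

```python
def reachable(graph, start):
    """Return the set of nodes reachable from start.

    graph maps each node to an iterable of its successors.
    """
    seen = {start: None}
    pending = {start: None}  # dict used as a set, as in the thread
    while pending:
        # Remove and return one entry; which one does not matter here.
        node, _ = pending.popitem()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen[succ] = None
                pending[succ] = None  # mutate the set while looping over it
    return set(seen)


graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": ["a"], "e": ["a"]}
print(sorted(reachable(graph, "a")))
```

Each node enters `pending` at most once and each removal is cheap, so the loop avoids the quadratic behaviour that comes from rebuilding the key list on every iteration.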

participants (9)
-
Eric S. Raymond
-
Fred L. Drake, Jr.
-
Greg Ward
-
Greg Wilson
-
Guido van Rossum
-
M.-A. Lemburg
-
Moshe Zadka
-
Moshe Zadka
-
Tim Peters