license issues with profiler.py and md5.h/md5c.c

A Debian user pointed out (http://bugs.debian.org/293932) that the current license for the Python profiler does not conform to the DFSG (Debian Free Software Guidelines).
http://www.python.org/doc/current/lib/node829.html states:
"This permission is explicitly restricted to the copying and modification of the software to remain in Python, compiled Python, or other languages (such as C) wherein the modified or derived code is exclusively imported into a Python module."
The DFSG, http://www.debian.org/doc/debian-policy/ch-archive.html#s-dfsg, third paragraph, states:
"Derived Works: The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software."
- Does somebody know the history of this license, and why it is more restrictive than the Python license?
- Is there a chance to change the license for these two modules (profile.py, pstats.py)?
The md5.h/md5c.c files allow "copy and use", but no modification of the files. There are some alternative implementations, e.g. in glibc and openssl, so a replacement should be safe. Any other requirements when considering a replacement?
Matthias

The md5.h/md5c.c files allow "copy and use", but no modification of the files. There are some alternative implementations, e.g. in glibc and openssl, so a replacement should be safe. Any other requirements when considering a replacement?
Matthias
I believe the "plan" for md5 and sha1 and such is to use the much faster openssl versions "in the future" (based on a long thread debating future interfaces to such things on python-dev last summer). That'll sidestep any tedious license issue and give a better implementation at the same time. i don't believe anyone has taken the time to make such a patch yet. -g

On Tue, 2005-02-08 at 11:52 -0800, Gregory P. Smith wrote:
The md5.h/md5c.c files allow "copy and use", but no modification of the files. There are some alternative implementations, e.g. in glibc and openssl, so a replacement should be safe. Any other requirements when considering a replacement?
One thing to consider is "degree of difficulty" :-)
Matthias
I believe the "plan" for md5 and sha1 and such is to use the much faster openssl versions "in the future" (based on a long thread debating future interfaces to such things on python-dev last summer). That'll sidestep any tedious license issue and give a better implementation at the same time. i don't believe anyone has taken the time to make such a patch yet.
I wasn't around for that discussion. There are two viable replacements for the RSA implementation currently used:
libmd <http://www.penguin.cz/~mhi/libmd/>
openssl <http://www.openssl.org/>
The libmd implementation is by Colin Plumb and has the licence: "This code is in the public domain; do with it what you wish." The API is identical to the RSA implementation and the BSD world's libmd, and hence it is a drop-in replacement. This implementation is faster than the RSA implementation.
The openssl implementation has an Apache-style license. The API is almost the same as the RSA API but slightly different, so it would require a little bit of work to make it fit. This implementation is the fastest currently available, as it includes many platform-specific optimisations for a large range of platforms.
Currently md5c.c is included in the Python sources. The libmd implementation has a drop-in replacement for md5c.c. The openssl implementation is a complicated tangle of Makefile-expanded template code that would be harder to include in the Python sources.
In the Linux world, openssl is starting to become ubiquitous, so not including it and statically or even dynamically linking against it is feasible. However, using Python in other lands will probably require something to be included.
Long term, I think openssl is the way to go. Short term, libmd is a painless replacement that gets around the licencing issues.
I have been using the libmd API stuff for md4 in librsync, and am looking at migrating to the openssl API. If people hassle me, I could probably do the openssl API migration for Python, but I'm not sure what the best approach would be to including the source in the Python sources.
FWIW, I also have an md4sum module and md4c.c implementation that I'm happy to contribute to Python (done for pysync).
--
Donovan Baarda <abo@minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/

On Feb 10, 2005, at 9:15 PM, Donovan Baarda wrote:
On Tue, 2005-02-08 at 11:52 -0800, Gregory P. Smith wrote:
The md5.h/md5c.c files allow "copy and use", but no modification of the files. There are some alternative implementations, e.g. in glibc and openssl, so a replacement should be safe. Any other requirements when considering a replacement?
One thing to consider is "degree of difficulty" :-)
Matthias
I believe the "plan" for md5 and sha1 and such is to use the much faster openssl versions "in the future" (based on a long thread debating future interfaces to such things on python-dev last summer). That'll sidestep any tedious license issue and give a better implementation at the same time. i don't believe anyone has taken the time to make such a patch yet.
I wasn't around for that discussion. There are two viable replacements for the RSA implementation currently used;
libmd <http://www.penguin.cz/~mhi/libmd/> openssl <http://www.openssl.org/>. -- In the Linux world, openssl is starting to become ubiquitous, so not including it and statically or even dynamically linking against it is feasible. However, using Python in other lands will probably require something to be included.
Long term, I think openssl is the way to go. Short term, libmd is a painless replacement that gets around the licencing issues.
OpenSSL is also ubiquitous on Mac OS X (as a shared lib):
Mac OS X 10.2.8 has OpenSSL 0.9.6i Feb 19 2003
Mac OS X 10.3.8 has OpenSSL 0.9.7b 10 Apr 2003
One possible alternative would be to bring in something like PyOpenSSL <http://pyopenssl.sourceforge.net/> and just rewrite the md5 (and sha?) extensions as Python modules that use that API.
-bob

On Thu, 2005-02-10 at 21:30 -0500, Bob Ippolito wrote:
On Feb 10, 2005, at 9:15 PM, Donovan Baarda wrote:
On Tue, 2005-02-08 at 11:52 -0800, Gregory P. Smith wrote: [...] One possible alternative would be to bring in something like PyOpenSSL <http://pyopenssl.sourceforge.net/> and just rewrite the md5 (and sha?) extensions as Python modules that use that API.
Only problem with this, is pyopenssl doesn't yet include any mdX or sha modules. -- Donovan Baarda <abo@minkirri.apana.org.au> http://minkirri.apana.org.au/~abo/

On Feb 10, 2005, at 9:50 PM, Donovan Baarda wrote:
On Thu, 2005-02-10 at 21:30 -0500, Bob Ippolito wrote:
On Feb 10, 2005, at 9:15 PM, Donovan Baarda wrote:
On Tue, 2005-02-08 at 11:52 -0800, Gregory P. Smith wrote: [...] One possible alternative would be to bring in something like PyOpenSSL <http://pyopenssl.sourceforge.net/> and just rewrite the md5 (and sha?) extensions as Python modules that use that API.
Only problem with this, is pyopenssl doesn't yet include any mdX or sha modules.
My bad, how about M2Crypto <http://sandbox.rulemaker.net/ngps/m2/> then? This one supports message digests and is more license compatible with Python to boot. -bob

On Thu, 2005-02-10 at 23:13 -0500, Bob Ippolito wrote:
On Feb 10, 2005, at 9:50 PM, Donovan Baarda wrote:
On Thu, 2005-02-10 at 21:30 -0500, Bob Ippolito wrote: [...] Only problem with this, is pyopenssl doesn't yet include any mdX or sha modules.
My bad, how about M2Crypto <http://sandbox.rulemaker.net/ngps/m2/> then? This one supports message digests and is more license compatible with Python to boot. [...]
This one does have md5 support, but the Python API is rather different from the current python md5sum API. It hooks into the slightly higher level EVP openssl layer, rather than the lower level md5 layer. Hooking into the EVP layer pretty much requires including all the openssl message digest implementations (which may or may not be a good idea).
It also uses SWIG to generate the extension module. I don't think anything else in Python itself uses SWIG, so starting to use it would introduce a "Build Dependency".
I think it would be cleaner and simpler to modify the existing md5module.c to use the openssl md5 layer API (this is just a search/replace to change the function names). The bigger problem is deciding what/how/whether to include the openssl md5 implementation sources so that win32 can use them.
--
Donovan Baarda <abo@minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/

On Fri, 2005-02-11 at 17:15 +1100, Donovan Baarda wrote: [...]
I think it would be cleaner and simpler to modify the existing md5module.c to use the openssl md5 layer API (this is just a search/replace to change the function names). The bigger problem is deciding what/how/whether to include the openssl md5 implementation sources so that win32 can use them.
Thinking about it, probably the best way is to include the libmd md5c.c modified to use the openssl API, and then use configure to check for and use openssl if it is available. That way win32 could use the provided md5c.c, and other platforms could use the faster openssl. -- Donovan Baarda <abo@minkirri.apana.org.au> http://minkirri.apana.org.au/~abo/

I think it would be cleaner and simpler to modify the existing md5module.c to use the openssl md5 layer API (this is just a search/replace to change the function names). The bigger problem is deciding what/how/whether to include the openssl md5 implementation sources so that win32 can use them.
yes, that is all i was suggesting. win32 python is already linked against openssl for the socket module ssl support, having the md5 and sha1 modules depend on openssl should not cause a problem. -greg

G'day again, From: "Gregory P. Smith" <greg@electricrain.com>
I think it would be cleaner and simpler to modify the existing md5module.c to use the openssl md5 layer API (this is just a search/replace to change the function names). The bigger problem is deciding what/how/whether to include the openssl md5 implementation sources so that win32 can use them.
yes, that is all i was suggesting.
win32 python is already linked against openssl for the socket module ssl support, having the md5 and sha1 modules depend on openssl should not cause a problem.
IANAL... I have too much common sense, so I won't argue licences :-)
So is openssl already included in the Python sources, or is it just a dependency? I had a quick look and couldn't find it so it must be a dependency.
Given that Python is already dependent on openssl, it makes sense to change md5sum to use it. I have a feeling that openssl internally uses md5, so this way we won't link against two different md5sum implementations.
----------------------------------------------------------------
Donovan Baarda http://minkirri.apana.org.au/~abo/
----------------------------------------------------------------

On Feb 11, 2005, at 6:11 PM, Donovan Baarda wrote:
G'day again,
From: "Gregory P. Smith" <greg@electricrain.com>
I think it would be cleaner and simpler to modify the existing md5module.c to use the openssl md5 layer API (this is just a search/replace to change the function names). The bigger problem is deciding what/how/whether to include the openssl md5 implementation sources so that win32 can use them.
yes, that is all i was suggesting.
win32 python is already linked against openssl for the socket module ssl support, having the md5 and sha1 modules depend on openssl should not cause a problem.
IANAL... I have too much common sense, so I won't argue licences :-)
So is openssl already included in the Python sources, or is it just a dependency? I had a quick look and couldn't find it so it must be a dependency.
Given that Python is already dependent on openssl, it makes sense to change md5sum to use it. I have a feeling that openssl internally uses md5, so this way we won't link against two different md5sum implementations.
It is an optional dependency that is used when present (read: not just win32). The sources are not included with Python. OpenSSL does internally have an implementation of md5 (and sha1, among other things). -bob

G'day, From: "Bob Ippolito" <bob@redivi.com>
On Feb 11, 2005, at 6:11 PM, Donovan Baarda wrote: [...]
Given that Python is already dependent on openssl, it makes sense to change md5sum to use it. I have a feeling that openssl internally uses md5, so this way we won't link against two different md5sum implementations.
It is an optional dependency that is used when present (read: not just win32). The sources are not included with Python.
Are there any potential problems with making the md5sum module availability "optional" in the same way as this?
OpenSSL does internally have an implementation of md5 (and sha1, among other things).
Yeah, I know, that's why it could be used for the md5sum module :-) What I meant was a Python application using ssl sockets and the md5sum module will effectively have two different md5sum implementations in memory. Using the openssl md5sum for the md5sum module will make it "leaner", as well as faster. ---------------------------------------------------------------- Donovan Baarda http://minkirri.apana.org.au/~abo/ ----------------------------------------------------------------

On Sat, Feb 12, 2005 at 01:54:27PM +1100, Donovan Baarda wrote:
Are there any potential problems with making the md5sum module availability "optional" in the same way as this?
The md5 module has been a standard module for a long time; making it optional in the next version of Python isn't possible. We'd have to require OpenSSL to compile Python. I'm happy to replace the MD5 and/or SHA implementations with other code, provided other code with a suitable license can be found. --amk

A.M. Kuchling wrote:
On Sat, Feb 12, 2005 at 01:54:27PM +1100, Donovan Baarda wrote:
Are there any potential problems with making the md5sum module availability "optional" in the same way as this?
The md5 module has been a standard module for a long time; making it optional in the next version of Python isn't possible. We'd have to require OpenSSL to compile Python.
I'm happy to replace the MD5 and/or SHA implementations with other code, provided other code with a suitable license can be found.
How about this one: http://sourceforge.net/project/showfiles.php?group_id=42360
From an API standpoint, it's trivially different from the one currently in Python.
From md5.c:
/*
  Copyright (C) 1999, 2000, 2002 Aladdin Enterprises. All rights reserved.
  This software is provided 'as-is', without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software.
  Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions:
  1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.
  3. This notice may not be removed or altered from any source distribution.
  L. Peter Deutsch
  ghost@aladdin.com
 */
/* $Id: md5.c,v 1.6 2002/04/13 19:20:28 lpd Exp $ */
/*
  Independent implementation of MD5 (RFC 1321).
  This code implements the MD5 Algorithm defined in RFC 1321, whose text is available at
    http://www.ietf.org/rfc/rfc1321.txt
  The code is derived from the text of the RFC, including the test suite (section A.5) but excluding the rest of Appendix A. It does not include any code or documentation that is identified in the RFC as being copyrighted.
  [etc.]
--
Robert Kern
rkern@ucsd.edu
"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter

On Sat, Feb 12, 2005 at 08:37:21AM -0500, A.M. Kuchling wrote:
On Sat, Feb 12, 2005 at 01:54:27PM +1100, Donovan Baarda wrote:
Are there any potential problems with making the md5sum module availability "optional" in the same way as this?
The md5 module has been a standard module for a long time; making it optional in the next version of Python isn't possible. We'd have to require OpenSSL to compile Python.
I'm happy to replace the MD5 and/or SHA implementations with other code, provided other code with a suitable license can be found.
agreed. it cannot be made optional. What I'd prefer (and will do if i find the time) is to have the md5 and sha1 modules use OpenSSL's implementations when available, falling back to their built in ones when openssl isn't present. That way it's always there but uses the much faster optimized openssl algorithms when they exist. -g
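The "use OpenSSL when present, otherwise fall back" idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not the eventual patch: `_hypothetical_openssl_md5` is a made-up module name standing in for an OpenSSL-backed extension, and `hashlib.md5` from the standard library stands in for the bundled implementation.

```python
import hashlib
import importlib


def load_md5_constructor(preferred=("_hypothetical_openssl_md5",),
                         fallback=hashlib.md5):
    """Return the md5 constructor of the first importable backend.

    The module names in `preferred` are hypothetical placeholders; if none
    of them can be imported, the always-available fallback is returned.
    """
    for module_name in preferred:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # backend not built on this platform, try the next one
        return module.md5
    return fallback


# Callers never need to know which backend they got.
md5 = load_md5_constructor()
digest = md5(b"abc").hexdigest()
```

Whichever backend is picked, the module is always importable and the digest for a given input is the same; only the speed differs.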

G'day, On Sat, 2005-02-12 at 13:04 -0800, Gregory P. Smith wrote:
On Sat, Feb 12, 2005 at 08:37:21AM -0500, A.M. Kuchling wrote:
On Sat, Feb 12, 2005 at 01:54:27PM +1100, Donovan Baarda wrote:
Are there any potential problems with making the md5sum module availability "optional" in the same way as this?
The md5 module has been a standard module for a long time; making it optional in the next version of Python isn't possible. We'd have to require OpenSSL to compile Python.
I'm happy to replace the MD5 and/or SHA implementations with other code, provided other code with a suitable license can be found.
agreed. it cannot be made optional. What I'd prefer (and will do if i find the time) is to have the md5 and sha1 modules use OpenSSL's implementations when available, falling back to their built in ones when openssl isn't present. That way it's always there but uses the much faster optimized openssl algorithms when they exist.
So we need a fallback md5 implementation for when openssl is not available.
The RSA implementation is not usable because it has an unsuitable license. Looking at this licence again, I'm not sure what the problem is. It allows you to freely modify, distribute, etc., with the only limit being that you must retain the RSA licence blurb.
The libmd implementation cannot be used because the author tried to give it away unconditionally, and the lawyers say you can't. (dumb! dumb! dumb! someone needs to figure out a way to systematically get around this kind of stupidity, perhaps have someone in a less legally stupid country claim and re-license free code).
The libmd5-rfc sourceforge project implementation <http://sourceforge.net/projects/libmd5-rfc/> looks OK. It needs to be modified to have an API identical to openssl (rename structures/functions). Then setup.py needs to be modified to use openssl if available, or fall back to the provided libmd5-rfc implementation.
The SHA module is a bit different... it includes a built in SHA implementation. It might pay to strip out the implementation and give it an openssl-like API, then make shamodule.c use it, or openssl if available.
Greg Smith might have already done much of this...
--
Donovan Baarda <abo@minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/
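The setup.py change described above amounts to a build-time choice between linking against OpenSSL and compiling a bundled source file. A rough sketch of that decision follows; the function name and the dictionary layout are invented for illustration, and the file names are simply the ones mentioned in this thread rather than the contents of any real setup.py.

```python
def md5_build_config(have_openssl):
    """Pick sources and libraries for a hypothetical md5 extension build.

    `have_openssl` would come from whatever detection the build performs;
    here it is just a boolean passed in by the caller.
    """
    if have_openssl:
        # Link against OpenSSL's libcrypto; no bundled md5 source is needed.
        return {"sources": ["md5module.c"], "libraries": ["crypto"]}
    # No OpenSSL found: compile the bundled fallback implementation in.
    return {"sources": ["md5module.c", "md5c.c"], "libraries": []}


with_openssl = md5_build_config(True)
without_openssl = md5_build_config(False)
```

Either way md5module.c is built, so the md5 module stays unconditionally available, which is the requirement raised earlier in the thread.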

fyi - i've updated the python sha1/md5 openssl patch. it now replaces the entire sha and md5 modules with a generic hashes module that gives access to all of the hash algorithms supported by OpenSSL (including appropriate legacy interface wrappers and falling back to the old code when compiled without openssl). https://sourceforge.net/tracker/index.php?func=detail&aid=1121611&group_id=5470&atid=305470 I don't quite like the module name 'hashes' that i chose for the generic interface (too close to the builtin hash() function). Other suggestions on a module name? 'digest' comes to mind. -greg

"Gregory P. Smith" wrote:
I don't quite like the module name 'hashes' that i chose for the generic interface (too close to the builtin hash() function). Other suggestions on a module name? 'digest' comes to mind.
hashtools, hashlib, and _hash are common names for helper modules like this. (you still provide md5 and sha wrappers, I hope) </F>

On Wed, 2005-02-16 at 22:53 -0800, Gregory P. Smith wrote:
fyi - i've updated the python sha1/md5 openssl patch. it now replaces the entire sha and md5 modules with a generic hashes module that gives access to all of the hash algorithms supported by OpenSSL (including appropriate legacy interface wrappers and falling back to the old code when compiled without openssl).
https://sourceforge.net/tracker/index.php?func=detail&aid=1121611&group_id=5470&atid=305470
I don't quite like the module name 'hashes' that i chose for the generic interface (too close to the builtin hash() function). Other suggestions on a module name? 'digest' comes to mind.
I just had a quick look, and have these comments (pseudo patch review?). Apologies for the noise on the list...
DESCRIPTION
===========
This patch keeps the current md5c.c, md5module.c files and adds the following; _hashopenssl.c, hashes.py, md5.py, sha.py.
The old md5 and sha extension modules get replaced by hashes.py, md5.py, and sha.py python modules that leverage off _hash (openssl) or _md5 and _sha (no openssl) extension modules.
The new _hash extension module "wraps" the high level openssl EVP interface, which uses a string parameter to indicate what type of message digest algorithm to use. The advantage of this is it makes all openssl supported digests available, and if openssl adds more, we get them for free. A disadvantage of this is it is an abstraction level above the actual md5 and sha implementations, and this may add overheads. These overheads are probably negligible compared to the actual implementation speedups.
The new _md5 and _sha extension modules are simply re-named versions of the old md5 and sha modules.
The hashes.py module acts as an import wrapper for _hash, and falls back to using _md5 and _sha modules if _hash is not available. It provides an EVP style API (string hash name parameter), that supports only md5 and sha hashes if openssl is not available.
The new md5.py and sha.py modules simply use hashes.py.
COMMENTS
========
The introduction of a "hashes" module with a new API that supports many different digests (provided openssl is available) is extending Python, not just "fixing the licenses" of md5 and sha modules.
If all we wanted to do was fix the md5 module, a simpler solution would be to change the md5c.c API to match openssl's implementation, and make md5module.c use it, conditionally compiling against md5c.c or linking against openssl in setup.py. A similar approach could be used for sha, but would require stripping the sha implementation out of shamodule.c.
I am mildly concerned about the namespace/filespace clutter introduced by this implementation... it feels unnecessary, as do the tangled dependencies between them. With openssl, hashes.py duplicates the functionality of _hash. Without openssl, md5.py and sha.py duplicate _md5 and _sha, via a roundabout route through hashes.py.
The python wrappers seem overly complicated, with things like
    def new(name, string=None):
        if string:
            return _hash.new(name, string)
        else:
            return _hash.new(name)
being common where the following would suffice;
    def new(name, string=""):
        return _hash.new(name, string)
I think this is because _hash.new() uses an optional string parameter, but I have a feeling a C update with a zero length string is faster than this Python if. If it was a concern, the C implementation could check the value of the string length before calling update.
Given the convenience methods for different hashes in hashes.py (which incidentally look like they are only available when _hash is not available... something else that needs fixing), the md5.py module could be simply coded as;
    from hashes import md5
    new = md5
Despite all these nit-picks, it looks pretty good. It is orders of magnitude better than any of the other non-existent solutions, including the one I didn't code :-)
--
Donovan Baarda <abo@minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/
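The simplified wrapper suggested in the review can be tried out today with the standard library's hashlib module, which offers a comparable name-based constructor. The sketch below uses `hashlib.new` as a stand-in for the patch's `_hash.new`; that substitution is an assumption made for illustration, and the wrapper names simply mirror the ones in the review.

```python
import hashlib


def new(name, string=b""):
    """Create a digest object by algorithm name, optionally seeded with data.

    hashlib.new stands in here for the patch's _hash.new; feeding it an
    empty byte string is equivalent to feeding it nothing.
    """
    return hashlib.new(name, string)


def md5(string=b""):
    """Legacy-style convenience wrapper built on the generic constructor."""
    return new("md5", string)


# One-shot and incremental use produce the same digest.
one_shot = md5(b"abc").hexdigest()
incremental = new("md5")
incremental.update(b"abc")
```

With the default argument handled in one place, the per-algorithm modules reduce to one-line wrappers, which is the point the review makes about md5.py.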

Donovan Baarda wrote:
This patch keeps the current md5c.c, md5module.c files and adds the following; _hashopenssl.c, hashes.py, md5.py, sha.py. [...] If all we wanted to do was fix the md5 module
If we want to fix the licensing issues with the md5 module, this patch does not help at all, as it keeps the current md5 module (along with its licensing issues). So any patch to solve the problem will need to delete the code with the questionable license. Then, the approach in the patch breaks the promise that the md5 module is always there. It would require that OpenSSL is always there - a promise that we cannot make (IMO). Regards, Martin

From: "Martin v. Löwis" <martin@v.loewis.de>
Donovan Baarda wrote:
This patch keeps the current md5c.c, md5module.c files and adds the following; _hashopenssl.c, hashes.py, md5.py, sha.py. [...] If all we wanted to do was fix the md5 module
If we want to fix the licensing issues with the md5 module, this patch does not help at all, as it keeps the current md5 module (along with its licensing issues). So any patch to solve the problem will need to delete the code with the questionable license.
It maybe half fixes it in that if Python is happy with the RSA one, they can continue to include it, and if Debian is unhappy with it, they can remove it and build against openssl. It doesn't fully fix the license problem. It is still worth considering because it doesn't make it worse, and it does allow Python to use much faster implementations and support other digest algorithms when openssl is available.
Then, the approach in the patch breaks the promise that the md5 module is always there. It would require that OpenSSL is always there - a promise that we cannot make (IMO).
It would be better if we found an alternative md5c.c. I found one that was the libmd implementation that someone mildly tweaked and then slapped an LGPL on. I have a feeling that would make the lawyers tremble more than the "public domain" libmd one, unless they are happy that someone else is prepared to wear the grief for slapping an LGPL onto something public domain.
Probably the best at the moment is the sourceforge one, which is listed as having a "zlib/libpng licence".
----------------------------------------------------------------
Donovan Baarda http://minkirri.apana.org.au/~abo/
----------------------------------------------------------------

On Fri, Feb 18, 2005 at 10:06:24AM +0100, "Martin v. Löwis" wrote:
Donovan Baarda wrote:
This patch keeps the current md5c.c, md5module.c files and adds the following; _hashopenssl.c, hashes.py, md5.py, sha.py. [...] If all we wanted to do was fix the md5 module
If we want to fix the licensing issues with the md5 module, this patch does not help at all, as it keeps the current md5 module (along with its licensing issues). So any patch to solve the problem will need to delete the code with the questionable license.
Then, the approach in the patch breaks the promise that the md5 module is always there. It would require that OpenSSL is always there - a promise that we cannot make (IMO).
I'm aware of that. My goals are primarily to get a good openssl based hashes/digest module going, to be used instead of the built in implementations when openssl is available, because openssl is -so- much faster. Fixing the Debian-instigated md5 licensing issue is secondary and is something I'll get to later on after i work on the fun stuff.
And as Donovan has said, the patch already does present Debian with the option of dropping that md5 module and using the openssl derived one instead if they're desperate. Based on laziness winning and the issue being so minor, i hope they just wait for a patch from me that replaces the md5c.c with one of the acceptably licensed ones for their 2.3/2.4 packages.
-g

Gregory P. Smith wrote:
fyi - i've updated the python sha1/md5 openssl patch. it now replaces the entire sha and md5 modules with a generic hashes module that gives access to all of the hash algorithms supported by OpenSSL (including appropriate legacy interface wrappers and falling back to the old code when compiled without openssl).
https://sourceforge.net/tracker/index.php?func=detail&aid=1121611&group_id=5470&atid=305470
I don't quite like the module name 'hashes' that i chose for the generic interface (too close to the builtin hash() function). Other suggestions on a module name? 'digest' comes to mind.
'hashtools' and 'hashlib' would both have precedents in the standard library (itertools and urllib, for example).
It occurs to me that such a module would provide a way to fix the bug with incorrectly hashable instances of new-style classes:
Py> class C:
...     def __eq__(self, other): return True
...
Py> hash(C())
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unhashable instance
Py> class C(object):
...     def __eq__(self, other): return True
...
Py> hash(C())
10357232
Guido wanted to fix this by eliminating object.__hash__, but that caused problems for Jython. If I remember that discussion correctly, the problem was that, in Jython, the default hash is _not_ simply hash(id(obj)) the way it is in CPython, so Python code needs a way to get access to the default implementation. A hashtools.default_hash that worked like the current object.__hash__ would seem to provide such a spelling, and allow object.__hash__ to be removed (fixing the above bug).
Cheers,
Nick.
--
Nick Coghlan | ncoghlan@email.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.skystorm.net
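For readers trying this on a current interpreter, here is a small sketch of the `default_hash` spelling proposed above. The helper is hypothetical (no such function exists in the standard library); it only shows one way the identity-based default could be reached explicitly. It also relies on Python 3 behaviour, where a class that defines `__eq__` without `__hash__` has its `__hash__` set to None.

```python
def default_hash(obj):
    """Hypothetical helper: the default object hash, bypassing any override."""
    return object.__hash__(obj)


class C:
    # Defining __eq__ without __hash__ makes instances unhashable in Python 3.
    def __eq__(self, other):
        return True


instance = C()
try:
    hash(instance)
    hashable = True
except TypeError:
    hashable = False

# The default implementation is still reachable through the explicit spelling.
fallback_value = default_hash(instance)
```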

I've created an OpenSSL version of the sha module. It is trivial to modify to be an md5 module. It's a first version with cleanup to be done and such. It is being managed in the SF patch manager: https://sourceforge.net/tracker/?func=detail&aid=1121611&group_id=5470&atid=305470 enjoy. i'll do more cleanup and work on it soon.

On Sat, 2005-02-12 at 17:35 -0800, Gregory P. Smith wrote:
I've created an OpenSSL version of the sha module. It is trivial to modify to be an md5 module. It's a first version with cleanup to be done and such. It is being managed in the SF patch manager:
https://sourceforge.net/tracker/?func=detail&aid=1121611&group_id=5470&atid=305470
enjoy. i'll do more cleanup and work on it soon.
Hmmm. I see the patch entry, but it seems to be missing the actual patch. Did you code this from scratch, or did you base it on the current md5module.c? Is it using the openssl sha interface, or the higher level EVP interface? The reason I ask is it would be pretty trivial to modify md5module.c to use the openssl API for any digest, and would be less risk than fresh-coding one. -- Donovan Baarda <abo@minkirri.apana.org.au> http://minkirri.apana.org.au/~abo/

On Mon, Feb 14, 2005 at 11:02:23AM +1100, Donovan Baarda wrote:
On Sat, 2005-02-12 at 17:35 -0800, Gregory P. Smith wrote:
I've created an OpenSSL version of the sha module. It is trivial to modify to be an md5 module. It's a first version with cleanup to be done and such. It is being managed in the SF patch manager:
https://sourceforge.net/tracker/?func=detail&aid=1121611&group_id=5470&atid=305470
enjoy. i'll do more cleanup and work on it soon.
Hmmm. I see the patch entry, but it seems to be missing the actual patch.
Did you code this from scratch, or did you base it on the current md5module.c? Is it using the openssl sha interface, or the higher level EVP interface?
The reason I ask is it would be pretty trivial to modify md5module.c to use the openssl API for any digest, and would be less risk than fresh-coding one.
Ugh. Sourceforge ignored it on the patch submission. i've attached it properly now.
This initial version is derived from shamodule.c, which does not have any license issues. It is currently only meant as an example of how easy it is to use the openssl hashing interface. I'm taking it and turning it into a generic openssl hash wrapper that'll do md5, sha1 and anything else.
-g

Donovan Baarda writes:
On Tue, 2005-02-08 at 11:52 -0800, Gregory P. Smith wrote:
The md5.h/md5c.c files allow "copy and use", but no modification of the files. There are some alternative implementations, e.g. in glibc and openssl, so a replacement should be safe. Any other requirements when considering a replacement?
One thing to consider is "degree of difficulty" :-)
Matthias
I believe the "plan" for md5 and sha1 and such is to use the much faster openssl versions "in the future" (based on a long thread debating future interfaces to such things on python-dev last summer). That'll sidestep any tedious license issue and give a better implementation at the same time. I don't believe anyone has taken the time to make such a patch yet.
I wasn't around for that discussion. There are two viable replacements for the RSA implementation currently used:
libmd <http://www.penguin.cz/~mhi/libmd/>
openssl <http://www.openssl.org/>
The libmd implementation is by Colin Plumb and has the licence: "This code is in the public domain; do with it what you wish." The API is identical to the RSA implementation and the BSD world's libmd, and hence it is a drop-in replacement. This implementation is faster than the RSA implementation.
[...]
Currently md5c.c is included in the python sources. The libmd implementation has a drop-in replacement for md5c.c. The openssl implementation is a complicated tangle of Makefile-expanded template code that would be harder to include in the Python sources.
I would prefer that one as a short term solution. Patch at #1118602.

On Fri, 11 Feb 2005 12:55:02 +0100, Matthias Klose <doko@cs.tu-berlin.de> wrote:
Currently md5c.c is included in the python sources. The libmd implementation has a drop-in replacement for md5c.c. The openssl implementation is a complicated tangle of Makefile-expanded template code that would be harder to include in the Python sources.
I would prefer that one as a short term solution. Patch at #1118602.
Unfortunately a license that says it is in the public domain is unacceptable (and should be for Debian, too). That is to say, it's not possible for someone to claim that something they produce is in the public domain. See http://www.linuxjournal.com/article/6225
Jeremy

[Matthias Klose]
A Debian user pointed out (http://bugs.debian.org/293932), that the current license for the Python profiler is not conforming to the DFSG (Debian free software guidelines).
http://www.python.org/doc/current/lib/node829.html states
"This permission is explicitly restricted to the copying and modification of the software to remain in Python, compiled Python, or other languages (such as C) wherein the modified or derived code is exclusively imported into a Python module."
...
- Does somebody know about the history of this license, why it is more restricted than the Python license?
Simply because that's the license Jim Roskind slapped on it when he contributed this code 10 years ago. I imagine (but don't know) that Guido looked at it, thought "hmm -- shouldn't be a problem for Python's users", and so accepted it.
- Is there a chance to change the license for these two modules (profile.py, pstats.py)?
Not unless some remnant of InfoSeek Corp can be found, since they're the copyright holder (their work, their license). Alas, Jim Roskind hasn't been seen in the Python world this century.
OTOH, if InfoSeek has vanished, it's unlikely they'll be suing anyone. Given how Python-specific profile.py and pstats.py are, it's hard for me to imagine anyone wanting to make a derivative that isn't imported into a Python module. In that respect it seems like a license clause that forbids you to run the software while the tip of your tongue is licking the back of your own neck.
Still, if that matters, perhaps Debian will need to leave these modules out. Bold <ahem> users will still be able to grab them from any number of other places.

Maybe some ambitious PSF activist could contact Roskind and Steve Kirsch and see if they know who at Disney to talk to... Or maybe the Disney guys who were at PyCon last year could help.
Jeremy
On Tue, 8 Feb 2005 15:37:50 -0500, Tim Peters <tim.peters@gmail.com> wrote:
[Matthias Klose]
A Debian user pointed out (http://bugs.debian.org/293932), that the current license for the Python profiler is not conforming to the DFSG (Debian free software guidelines).
http://www.python.org/doc/current/lib/node829.html states
"This permission is explicitly restricted to the copying and modification of the software to remain in Python, compiled Python, or other languages (such as C) wherein the modified or derived code is exclusively imported into a Python module."
...
- Does somebody know about the history of this license, why it is more restricted than the Python license?
Simply because that's the license Jim Roskind slapped on it when he contributed this code 10 years ago. I imagine (but don't know) that Guido looked at it, thought "hmm -- shouldn't be a problem for Python's users", and so accepted it.
- Is there a chance to change the license for these two modules (profile.py, pstats.py)?
Not unless some remnant of InfoSeek Corp can be found, since they're the copyright holder (their work, their license). Alas, Jim Roskind hasn't been seen in the Python world this century.
OTOH, if InfoSeek has vanished, it's unlikely they'll be suing anyone. Given how Python-specific profile.py and pstats.py are, it's hard for me to imagine anyone wanting to make a derivative that isn't imported into a Python module. In that respect it seems like a license clause that forbids you to run the software while the tip of your tongue is licking the back of your own neck.
Still, if that matters, perhaps Debian will need to leave these modules out. Bold <ahem> users will still be able to grab them from any number of other places.

Jeremy Hylton writes:
Maybe some ambitious PSF activist could contact Roskind and Steve Kirsch and see if they know who at Disney to talk to... Or maybe the Disney guys who were at PyCon last year could help.
please could somebody give me a contact address? Matthias

>> Maybe some ambitious PSF activist could contact Roskind and Steve
>> Kirsch and see if they know who at Disney to talk to... Or maybe the
>> Disney guys who were at PyCon last year could help.
Matthias> please could somebody give me a contact address?
Steve's easy enough to get ahold of: http://www.skirsch.com/ (He even still has an UltraSeek-powered search of his site. ;-)
A search of Kirsch's site for Jim Roskind returned jar@netscape.com, but that was dated 31 Oct 2000. An abstract for a talk at University of Arizona in late 2003 sort of implied he was still at Netscape then ... maybe...
Skip

On Tue, 8 Feb 2005 15:52:29 -0500, Jeremy Hylton <jhylton@gmail.com> wrote:
Maybe some ambitious PSF activist could contact Roskind and Steve Kirsch and see if they know who at Disney to talk to... Or maybe the Disney guys who were at PyCon last year could help.
I contacted Jim. His response follows:
---
I'm a strong supporter of Opensource software, but I'm probably not going to be able to help you very much. I could be much more helpful with understanding the code or its use ;-).
To summarize what I'll say: I don't own the rights to this stuff. ... but I don't believe there are any patents that I was ever involved with that might encumber this work.
I would note that my profiler code is really very rarely used in commercial products, and it is much more typically used by developers (I guess a developer toolkit, if sold, would use it). I'm pretty delighted that the code has found so much use by developers over the years. As I noted in the intro to the documentation, I had only been coding in Python for 3 weeks when I wrote it. On the positive side, it exposed many weaknesses in many developers' code (including our own at InfoSeek), as well as in core Python code (subtle bugs in the interpreter) that surely helped everyone. Even though I was a newbie, it was VERY carefully crafted, and I'd expect that it would take a fair amount of effort to reproduce it (and that is probably why it has not been changed much... or at least no one told me when they changed/fixed it ;-) ).
With regard to why I probably can't help much.....
First off, InfoSeek (holder of the copyright) was bought by Disney, and I don't know what if anything has eventually become of the tradename. There is a chance that Disney owns the rights... and I have no idea who to ask there :-/.
Second, I took a look at the Copyright, and it sure seems pretty permissive. I'm amazed if folks want something more permissive. This is what I found on the web for it:
Copyright © 1994, by InfoSeek Corporation, all rights reserved.
Written by James Roskind.
Permission to use, copy, modify, and distribute this Python software and its associated documentation for any purpose (subject to the restriction in the following sentence) without fee is hereby granted, provided that the above copyright notice appears in all copies, and that both that copyright notice and this permission notice appear in supporting documentation, and that the name of InfoSeek not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission. This permission is explicitly restricted to the copying and modification of the software to remain in Python, compiled Python, or other languages (such as C) wherein the modified or derived code is exclusively imported into a Python module.
INFOSEEK CORPORATION DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL INFOSEEK CORPORATION BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
As I recall, I probably personally created the terms of the above license. I used a similar license on my C/C++ grammar, and InfoSeek just added a bunch of wording to be sure that they were not at risk, and that their name would not be used in vain (or in advertising material). I think they were also interested in limiting its use to Python.... but I don't think that is a concern that would bother you.
I read the link you directed me to, and its primary focus seemed to be on patents for related or included technology. I don't believe that InfoSeek applied for or got any patents in this area (and certainly if they did so without my name, it would probably invalidate the patent), and I'm sure I didn't get any patents in this area at Netscape/AOL. In fact I don't think I got any patents back in 1994 or 1995. My only prior patent dated back to about 1983 (a hardware patent) that has since expired.
I have some patents since (roughly) 1995, and even though I don't think any of them relate to profiling (though some did relate to languages, or more specifically, security in languages), I wouldn't want to mess with assigning rights to any of those patents, as they belong to AOL/Netscape. Here again, to my knowledge, none of my patents relate in any way to this area (profiling). Sadly, if they did, I would not have the right to assign them.
I'm sure you're just doing your job, and following through by dotting all the I's and crossing all T's. My suggestion is to (as you said) work around the issue. You could always re-write the code from scratch, as the approaches are not rocket science and are pretty thoroughly explained. I wouldn't suggest it unless you are desperate. If I were you, I'd wait for a license problem to emerge (which I don't believe will ever happen).
---
FWIW, I agree. Personally, I think that if Debian has a problem with the above, it's their problem to deal with, not Python's.
--david

David Ascher wrote:
FWIW, I agree. Personally, I think that if Debian has a problem with the above, it's their problem to deal with, not Python's.
The OSI may also have a problem with the license if they were to be made aware of it. See section 8 of the Open Source Definition:
"""8. License Must Not Be Specific to a Product
The rights attached to the program must not depend on the program's being part of a particular software distribution. If the program is extracted from that distribution and used or distributed within the terms of the program's license, all parties to whom the program is redistributed should have the same rights as those that are granted in conjunction with the original software distribution.
"""
I'm not entirely sure if this affects the PSF's use of OSI's trademark. IANAL. TINLA.
--
Robert Kern rkern@ucsd.edu
"In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter
participants (14): "Martin v. Löwis", A.M. Kuchling, Barry Warsaw, Bob Ippolito, David Ascher, Donovan Baarda, Fredrik Lundh, Gregory P. Smith, Jeremy Hylton, Matthias Klose, Nick Coghlan, Robert Kern, Skip Montanaro, Tim Peters