From amos@digicool.com  Wed May  2 03:02:07 2001
From: amos@digicool.com (Amos Latteier)
Date: Tue, 01 May 2001 19:02:07 -0700
Subject: [Catalog-sig] [Announce] Catalog Server Prototype Updated
Message-ID: <3AEF6A9F.4067633E@digicool.com>

Hi Guys,

I've updated my catalog server prototype.

  http://63.230.174.230:8080/archive

There have been quite a few changes. 

  * It now supports PEP 243 (HTTP POST to /archive/pep243_accept to try
it out.)

  * It now supports platform information for binary packages

  * You can now edit your packages

  * Lots of little changes (including support for undo!)

I think that the prototype is getting pretty usable.

Enjoy! And let me know if you have any problems or suggestions.

-Amos

P.S. You may note that all packages are now listed as being uploaded by
me. That is because I had to manually rebuild the archive since this
version changed the internals so much.

--
Amos Latteier         mailto:amos@digicool.com
Digital Creations     http://www.digicool.com


From martin@loewis.home.cs.tu-berlin.de  Wed May  2 08:31:27 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 2 May 2001 09:31:27 +0200
Subject: [Catalog-sig] [Announce] Catalog Server Prototype Updated
In-Reply-To: <3AEF6A9F.4067633E@digicool.com> (message from Amos Latteier on
 Tue, 01 May 2001 19:02:07 -0700)
References: <3AEF6A9F.4067633E@digicool.com>
Message-ID: <200105020731.f427VRe02096@mira.informatik.hu-berlin.de>

>   * It now supports PEP 243 (HTTP POST to /archive/pep243_accept to try
> it out.)

This is what I'm most interested in, so I tried it first, using
swalowsupp.py (adapted to the right host, port, and relative path).
sendFile returned ('TRYAGAIN', ''). From the strace output, I could
see that the server response started with

HTTP/1.0 503 Service Unavailable\r\n
Server: Zope/(unreleased version) ZServer/1.1b1\r\n
Date: Wed, 02 May 2001 07:15:54 GMT\r\n
Bobo-Exception-File: /home/amos/Trunk/lib/python/Products/PythonCatalog/PEP243.py\r\n
Content-Type: text/html\r\n
Bobo-Exception-Type: NameError\r\n
Connection: close\r\n
Bobo-Exception-Value: <HTML><HEAD><TITLE>archive</TITLE></HEAD><BODY BGCOLOR=\#FFFFF\>   <TABLE BORDER=\0\ WIDTH=\100%\> <TR VALIGN=\TOP\>  <TD WIDTH=\10%\ ALIG=\CENTER\> <IMG SRC=\http://pdx:8080/p_/ZButton\ ALT=\Zope\> </TD>  <TD WIDTH=\90%\>   <H2>Zope Error</H2>   <P>Zop\r\n
Content-Length: 2006\r\n
Bobo-Exception-Line: 61\r\n
\r\n

Unfortunately, swalowsupp closed the connection afterwards, so I did
not get to see the 2006 bytes content.
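
The Bobo-Exception-* headers already name the exception type, file, and
line, so they can be read without the body. A minimal sketch, assuming
Python 3 and its standard email parser; the function name
bobo_exception_info is made up for illustration, and the header values
are copied from the trace above:

```python
from email.parser import Parser

# Header block as seen in the strace output above (status line omitted,
# the long Bobo-Exception-Value header left out for brevity).
RAW_HEADERS = "\r\n".join([
    "Server: Zope/(unreleased version) ZServer/1.1b1",
    "Date: Wed, 02 May 2001 07:15:54 GMT",
    "Bobo-Exception-File: /home/amos/Trunk/lib/python/Products/PythonCatalog/PEP243.py",
    "Content-Type: text/html",
    "Bobo-Exception-Type: NameError",
    "Connection: close",
    "Content-Length: 2006",
    "Bobo-Exception-Line: 61",
    "",
    "",
])


def bobo_exception_info(raw_headers):
    """Return the Bobo-Exception-* headers as a dict keyed by their suffix."""
    message = Parser().parsestr(raw_headers, headersonly=True)
    prefix = "bobo-exception-"
    info = {}
    for name, value in message.items():
        if name.lower().startswith(prefix):
            info[name.lower()[len(prefix):]] = value.strip()
    return info


info = bobo_exception_info(RAW_HEADERS)
print("%s at %s, line %s" % (info["type"], info["file"], info["line"]))
```

This only inspects headers; seeing the 2006-byte error page would still
require reading the response body before closing the connection.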

Using another approach, I tried to run Lynx, and Netscape, on the file

<html>
<body>
        <H1>Upload file</H1>
        <FORM NAME="fileupload" METHOD="POST" ACTION="http://63.230.174.230:8080/archive/pep243_accept"
              ENCTYPE="multipart/form-data">
        <INPUT TYPE="file" NAME="distribution"><BR>
        <INPUT TYPE="text" NAME="distmd5sum"><BR>
        <INPUT TYPE="file" NAME="pkginfo"><BR>
        <INPUT TYPE="text" NAME="infomd5sum"><BR>
        <INPUT TYPE="text" NAME="platform"><BR>
        <INPUT TYPE="file" NAME="signature"><BR>
        <INPUT TYPE="hidden" NAME="protocol_version" VALUE="1"><BR>
        <INPUT TYPE="SUBMIT" VALUE="Upload">
        </FORM>
</body>
</html>

Even though I've entered a distribution and a signature, both browsers
would not include them in their HTTP request. Any idea what could be
wrong with that form?

Regards,
Martin

P.S. In case anybody wants to experiment with it, I include my
modified swalowsupp.py as well. To initiate an upload, do something like

swalowsupp.sendDist("PyXML-0.7.0.tar.gz",
  signature=open("PyXML-0.7.0.tar.gz.asc").read())

#!/usr/bin/env python

'''Routines for submission of distributions to repository server.'''

# created 2001/03/26 by Sean Reifschneider <jafo-swalow@tummy.com>

uploadHost = 'community.tummy.com'


import httplib, urllib
import time, os, sys
import md5


#####################################
#  emulate 'md5sum' command on a file

def md5sum(file):
	fp = open(file, 'rb')
	md = md5.new()
	while 1:
		data = fp.read(10240)
		if not data: break
		md.update(data)
	digest = md.digest()

	sum = reduce(lambda x,y: x + ('%02x' % y), map(ord, digest), '')
	return(sum)


#######################################################
def sendDist(fileName, pkgFile = None, platform = None,
		signature = None, userAgent = 'swalow', uploadHost = None):
	if uploadHost == None: uploadHost = '63.230.174.230'
	tmp = os.environ.get('PYTHON_MODULE_SERVER', None)
	if tmp: uploadHost = tmp

	#  get file information
	fileLen = os.stat(fileName)[6]
	fileMD5 = md5sum(fileName)

	#  create body
	boundary = '%s%.8f_%s' % ( '-' * 30, time.time(), os.uname()[1] )
	boundary = '---------------------------87109191412184106881070100800'
	body = ''
	body = body + '--%s\r\nContent-Disposition: form-data; ' \
			'name="protocol_version"\r\n\r\n1\r\n' % ( boundary, )
	if platform:
		body = body + '--%s\r\nContent-Disposition: form-data; ' \
				'name="platform"\r\n\r\n%s\r\n' % ( boundary, platform )
	if signature:
		body = body + '--%s\r\nContent-Disposition: form-data; ' \
				'name="signature"\r\n\r\n%s\r\n' % ( boundary, signature )
	body = body + '--%s\r\nContent-Disposition: form-data; ' \
			'name="distmd5sum"\r\n\r\n%s\r\n' % ( boundary, fileMD5 )
	body = body + '--%s\r\nContent-Disposition: form-data; name="distribution"' \
			'; filename="%s"\r\n\r\n' % ( boundary, os.path.basename(fileName) )
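	body2 = ''
	#  closing boundary; the leading CRLF ends the last file part
	body3 = '\r\n--%s--\r\n' % boundary

	if pkgFile != None:
		fileLen = fileLen + os.stat(pkgFile)[6]
		infoMD5 = md5sum(pkgFile)
		#  the leading CRLF ends the distribution file part
		body2 = body2 + '\r\n--%s\r\nContent-Disposition: form-data; ' \
				'name="infomd5sum"\r\n\r\n%s\r\n' % ( boundary, infoMD5 )
		body2 = body2 + '--%s\r\nContent-Disposition: form-data; ' \
				'name="pkginfo"; filename="%s"\r\n\r\n' \
				% ( boundary, os.path.basename(pkgFile) )

	#  send header; Content-length covers every piece sent below
	h = httplib.HTTP(uploadHost,8080)
	h.putrequest('POST', '/archive/pep243_accept')
	h.putheader('Content-length', '%d'
			% (len(body) + fileLen + len(body2) + len(body3)))
	h.putheader('Content-type', 'multipart/form-data; boundary=%s' % boundary)
	#h.putheader('User-Agent', userAgent)
	h.endheaders()

	#  send body: form fields and the distribution data
	h.send(body)
	fp = open(fileName, 'rb')
	while 1:
		data = fp.read(4096)
		if not data: break
		h.send(data)
	fp.close()
	h.send(body2)

	#  send the optional pkginfo data, then the closing boundary
	if pkgFile != None:
		fp = open(pkgFile, 'rb')
		while 1:
			data = fp.read(4096)
			if not data: break
			h.send(data)
		fp.close()
	h.send(body3)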

	reply, msg, hdrs = h.getreply()
	status = hdrs.get('X-Swalow-Status', 'TRYAGAIN')
	reason = hdrs.get('X-Swalow-Reason', '')

	return(status, reason)
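
For reference, the same request body can be assembled in one piece. This
is a minimal sketch, assuming Python 3 (hashlib and uuid from the standard
library) rather than the interpreter the script above targets. The helper
name build_pep243_body is hypothetical; the field names are the ones the
script already sends. It only builds the bytes and does not contact any
server:

```python
import hashlib
import uuid


def build_pep243_body(dist_name, dist_data, platform=None, signature=None):
    """Return (boundary, body) for a multipart/form-data upload.

    dist_data and signature are bytes; dist_name and platform are str.
    """
    boundary = uuid.uuid4().hex.encode("ascii")

    # Plain form fields, in the same order as the script above.
    fields = [(b"protocol_version", b"1")]
    if platform:
        fields.append((b"platform", platform.encode("ascii")))
    if signature:
        fields.append((b"signature", signature))
    md5_hex = hashlib.md5(dist_data).hexdigest().encode("ascii")
    fields.append((b"distmd5sum", md5_hex))

    chunks = []
    for name, value in fields:
        chunks += [b"--", boundary, b"\r\n",
                   b'Content-Disposition: form-data; name="', name,
                   b'"\r\n\r\n', value, b"\r\n"]

    # The file part, followed by the closing boundary.
    chunks += [b"--", boundary, b"\r\n",
               b'Content-Disposition: form-data; name="distribution"; filename="',
               dist_name.encode("ascii"), b'"\r\n\r\n',
               dist_data, b"\r\n",
               b"--", boundary, b"--\r\n"]
    return boundary.decode("ascii"), b"".join(chunks)


boundary, body = build_pep243_body("AFoo-1.0.tar.gz", b"abc", platform="linux")
print(len(body), "bytes, boundary", boundary)
```

Building the whole body first makes Content-Length simply len(body), at
the cost of holding the distribution in memory.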


From amos@digicool.com  Wed May  2 20:00:17 2001
From: amos@digicool.com (Amos Latteier)
Date: Wed, 02 May 2001 15:00:17 -0400
Subject: [Catalog-sig] [Announce] Catalog Server Prototype Updated
In-Reply-To: <200105020731.f427VRe02096@mira.informatik.hu-berlin.de>
Message-ID: <web-1860066@digicool.com>

On Wed, 2 May 2001 09:31:27 +0200
 "Martin v. Loewis" <martin@loewis.home.cs.tu-berlin.de>
wrote:
> >   * It now supports PEP 243 (HTTP POST to /archive/pep243_accept to try
> > it out.)
> 
> This is what I'm most interested in, so I tried it first, using
> swalowsupp.py (adapted to the right host, port, and relative path).

Thanks for the bug report. I'll try to get it working.

In the meantime, I copied the form from PEP 243 up to the
server:

  http://63.230.174.230:8080/pep243

It works for me.

-Amos


From amos@digicool.com  Wed May  2 23:43:08 2001
From: amos@digicool.com (Amos Latteier)
Date: Wed, 02 May 2001 18:43:08 -0400
Subject: [Catalog-sig] [Announce] Catalog Server Prototype Updated
In-Reply-To: <200105020731.f427VRe02096@mira.informatik.hu-berlin.de>
Message-ID: <web-1861021@digicool.com>

On Wed, 2 May 2001 09:31:27 +0200
 "Martin v. Loewis" <martin@loewis.home.cs.tu-berlin.de>
wrote:
> >   * It now supports PEP 243 (HTTP POST to /archive/pep243_accept to try
> > it out.)
> 
> This is what I'm most interested in, so I tried it first, using
> swalowsupp.py (adapted to the right host, port, and relative path).
> sendFile returned ('TRYAGAIN', '').

OK, I think that I've fixed this now. I am now able to
upload a test package with your upload script.

Let me know if you have any more problems.

-Amos



From martin@loewis.home.cs.tu-berlin.de  Wed May  2 23:47:56 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 3 May 2001 00:47:56 +0200
Subject: [Catalog-sig] [Announce] Catalog Server Prototype Updated
In-Reply-To: <web-1860066@digicool.com> (amos@digicool.com)
References: <web-1860066@digicool.com>
Message-ID: <200105022247.f42MluR01737@mira.informatik.hu-berlin.de>

> In the meantime, I copied the form from PEP 243 up to the
> server:
> 
>   http://63.230.174.230:8080/pep243
> 
> It works for me.

So it does for me; not sure what I've been doing wrong.

Is the prototype supposed to do anything with the signature, yet?  In
theory, it could reliably determine who the uploader is, provided he
had uploaded his public key before.

Regards,
Martin


From amos@digicool.com  Thu May  3 18:34:15 2001
From: amos@digicool.com (Amos Latteier)
Date: Thu, 03 May 2001 10:34:15 -0700
Subject: [Catalog-sig] [Announce] Catalog Server Prototype Updated
References: <web-1860066@digicool.com> <200105022247.f42MluR01737@mira.informatik.hu-berlin.de>
Message-ID: <3AF19697.A7A7D0FB@digicool.com>

"Martin v. Loewis" wrote:
> Is the prototype supposed to do anything with the signature, yet? 

It stores them and makes them available to downloaders. It doesn't do
any automated checking yet.

> In theory, it could reliably determine who the uploader is, provided he
> had uploaded his public key before.

Well there is a facility to upload your public key if you have an
account.

Can you tell me how to determine the uploader given a signature and a
list of public keys?

Thanks!

-Amos

--
Amos Latteier         mailto:amos@digicool.com
Digital Creations     http://www.digicool.com


From martin@loewis.home.cs.tu-berlin.de  Thu May  3 22:57:00 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 3 May 2001 23:57:00 +0200
Subject: [Catalog-sig] [Announce] Catalog Server Prototype Updated
In-Reply-To: <3AF19697.A7A7D0FB@digicool.com> (message from Amos Latteier on
 Thu, 03 May 2001 10:34:15 -0700)
References: <web-1860066@digicool.com> <200105022247.f42MluR01737@mira.informatik.hu-berlin.de> <3AF19697.A7A7D0FB@digicool.com>
Message-ID: <200105032157.f43Lv0j01408@mira.informatik.hu-berlin.de>

> Can you tell me how to determine the uploader given a signature and a
> list of public keys?

I think you first need to install all public keys in a
keyring. Assuming you use gpg, this should be done with gpg --import.
Then, given the signature and the file, you do

gpg --verify AFoo-1.0.tar.gz.asc AFoo-1.0.tar.gz

It then prints a message like

gpg: Signature made Thu May  3 23:04:07 2001 CEST using DSA key ID DC3E5D42
gpg: Good signature from "Martin v. Loewis <martin@loewis.home.cs.tu-berlin.de>"

There is also a GPG module at http://www.amk.ca/python/code/gpg.html,
which already processes the GPG output. Using the --status-fd option,
you get output that is much better parsable; in my case

[GNUPG:] SIG_ID VptwaSnFDdwDevjjAwD4bbUeWGI 2001-05-03 988923847
[GNUPG:] GOODSIG 10459BC5DC3E5D42 Martin v. Loewis <martin@loewis.home.cs.tu-berlin.de>
[GNUPG:] VALIDSIG E6ACD89306E0F05FA7653FCA10459BC5DC3E5D42 2001-05-03 988923847
[GNUPG:] TRUST_ULTIMATE                                                        

All this can be probably made to work with pgp as well, but you'd have
to figure it out yourself.
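
The --status-fd lines above are simple enough to parse by hand. A minimal
sketch, assuming Python 3 and that the status output has already been
captured as text; the function names and the fingerprint-to-account table
are hypothetical, and only the GOODSIG and VALIDSIG lines shown above are
interpreted:

```python
STATUS_PREFIX = "[GNUPG:] "


def parse_gpg_status(status_text):
    """Extract signer details from captured `gpg --status-fd` output."""
    result = {"good": False, "key_id": None,
              "user_id": None, "fingerprint": None}
    for line in status_text.splitlines():
        if not line.startswith(STATUS_PREFIX):
            continue
        fields = line[len(STATUS_PREFIX):].split(None, 2)
        if not fields:
            continue
        if fields[0] == "GOODSIG" and len(fields) == 3:
            result["good"] = True
            result["key_id"] = fields[1]
            result["user_id"] = fields[2].strip()
        elif fields[0] == "VALIDSIG" and len(fields) >= 2:
            result["fingerprint"] = fields[1]
    return result


def find_uploader(status_text, accounts_by_fingerprint):
    """Return the account owning the signing key, or None."""
    status = parse_gpg_status(status_text)
    if not status["good"]:
        return None
    return accounts_by_fingerprint.get(status["fingerprint"])


SAMPLE = """\
[GNUPG:] SIG_ID VptwaSnFDdwDevjjAwD4bbUeWGI 2001-05-03 988923847
[GNUPG:] GOODSIG 10459BC5DC3E5D42 Martin v. Loewis <martin@loewis.home.cs.tu-berlin.de>
[GNUPG:] VALIDSIG E6ACD89306E0F05FA7653FCA10459BC5DC3E5D42 2001-05-03 988923847
[GNUPG:] TRUST_ULTIMATE
"""

# Hypothetical table built when users upload their public keys.
accounts = {"E6ACD89306E0F05FA7653FCA10459BC5DC3E5D42": "loewis"}
print(find_uploader(SAMPLE, accounts))
```

Keying the table on the full fingerprint from VALIDSIG, rather than the
shorter key ID, avoids confusing two keys whose IDs happen to collide.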

Regards,
Martin


From graeme@sofcom.com.au  Thu May 17 02:27:13 2001
From: graeme@sofcom.com.au (Graeme Matthew)
Date: Thu, 17 May 2001 11:27:13 +1000
Subject: [Catalog-sig] Help with Python Regex's
Message-ID: <000c01c0de70$8065ea00$349207cb@gatewaypc>

I am very new to Python (as I'm a Perl developer, sorry!) and am still
trying to get my head around the regular expression model.
Python does not seem to have Perl's variable interpolation (or does it?)

I have an HTML file, example below:

<HTML>
<HEAD>
<BODY>
This is a test of things<BR>
<!-- Replace Me -->
</BODY>
</HTML>

This is my template file. I want to open it and replace the
<!-- Replace Me --> value with a new value:

Here's my code:

import re

f = open("C:\\www.timemanager.com\\test.html")
fileContent = f.read()
f.close

newVal = "Heres the replacement code which means it worked"
key = "Replace Me"

re.sub('\<!--\s*' + key + '\s*--\>',newVal,fileContent)

print fileContent

raw_input("ok done")

It won't work. Could someone please explain where I am going wrong?
Thanks a mill.





Graeme Matthew - Internet Programmer
Software Communication Group
Como Centre, 644 Chapel Street
South Yarra, Melbourne
Victoria
Mobile: 0412 806 476





From guido@digicool.com  Thu May 17 04:59:56 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 16 May 2001 22:59:56 -0500
Subject: [Catalog-sig] Help with Python Regex's
In-Reply-To: Your message of "Thu, 17 May 2001 11:27:13 +1000."
 <000c01c0de70$8065ea00$349207cb@gatewaypc>
References: <000c01c0de70$8065ea00$349207cb@gatewaypc>
Message-ID: <200105170359.WAA08296@cj20424-a.reston1.va.home.com>

> import re
> 
> f = open("C:\\www.timemanager.com\\test.html")
> fileContent = f.read()
> f.close
> 
> newVal = "Heres the replacement code which means it worked"
> key = "Replace Me"
> 
> re.sub('\<!--\s*' + key + '\s*--\>',newVal,fileContent)

Try

  fileContent = re.sub(...)

instead.  Python never has side effects on simple variables!
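
A minimal sketch of that advice, assuming Python 3 and an in-memory
template instead of the file on disk; the helper name fill_template is
made up for illustration. Strings are immutable, so re.sub returns a new
string that has to be bound to a name:

```python
import re


def fill_template(template, key, new_value):
    """Replace the <!-- key --> comment in template with new_value."""
    # re.escape keeps any regex metacharacters in key literal.
    pattern = r"<!--\s*" + re.escape(key) + r"\s*-->"
    # A function replacement keeps backslashes in new_value literal.
    return re.sub(pattern, lambda match: new_value, template)


template = (
    "<HTML>\n<HEAD>\n<BODY>\n"
    "This is a test of things<BR>\n"
    "<!-- Replace Me -->\n"
    "</BODY>\n</HTML>\n"
)

new_val = "Heres the replacement code which means it worked"
file_content = fill_template(template, "Replace Me", new_val)
print(file_content)
```

The original template string is left untouched; only file_content holds
the substituted text.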

> print fileContent
> 
> raw_input("ok done")

--Guido van Rossum (home page: http://www.python.org/~guido/)


From luisleonellopez@hotmail.com  Sun May 27 23:48:51 2001
From: luisleonellopez@hotmail.com (Luis Leonel Lopez)
Date: Sun, 27 May 2001 22:48:51 -0000
Subject: [Catalog-sig] Questions about efficiency.
Message-ID: <F168FsTmIClmpcQABZT0000d6ad@hotmail.com>

Dear friends,

I wrote two functions which receive a sequence and return a list
without duplicate elements. Compared with Tim Peters' "unique" function
(as it appears in the Python Cookbook:

def unique(s):
    """Return a list of the elements in s, but without duplicates.

    For example, unique([1,2,3,1,2,3]) is some permutation of [1,2,3],
    unique("abcabc") some permutation of ["a", "b", "c"], and
    unique(([1, 2], [2, 3], [1, 2])) some permutation of
    [[2, 3], [1, 2]].

    For best speed, all sequence elements should be hashable.  Then
    unique() will usually work in linear time.

    If not possible, the sequence elements should enjoy a total
    ordering, and if list(s).sort() doesn't raise TypeError it's
    assumed that they do enjoy a total ordering.  Then unique() will
    usually work in O(N*log2(N)) time.

    If that's not possible either, the sequence elements must support
    equality-testing.  Then unique() will usually work in quadratic
    time.
    """

    n = len(s)
    if n == 0:
        return []

    # Try using a dict first, as that's the fastest and will usually
    # work.  If it doesn't work, it will usually fail quickly, so it
    # usually doesn't cost much to *try* it.  It requires that all the
    # sequence elements be hashable, and support equality comparison.
    u = {}
    try:
        for x in s:
            u[x] = 1
    except TypeError:
        del u  # move on to the next method
    else:
        return u.keys()

    # We can't hash all the elements.  Second fastest is to sort,
    # which brings the equal elements together; then duplicates are
    # easy to weed out in a single pass.
    # NOTE:  Python's list.sort() was designed to be efficient in the
    # presence of many duplicate elements.  This isn't true of all
    # sort functions in all languages or libraries, so this approach
    # is more effective in Python than it may be elsewhere.
    try:
        t = list(s)
        t.sort()
    except TypeError:
        del t  # move on to the next method
    else:
        assert n > 0
        last = t[0]
        lasti = i = 1
        while i < n:
            if t[i] != last:
                t[lasti] = last = t[i]
                lasti += 1
            i += 1
        return t[:lasti]

    # Brute force is all that's left.
    u = []
    for x in s:
        if x not in u:
            u.append(x)
    return u

), my functions use the slowest method, brute force. I'd like to know
which of my two functions is better or more efficient, and why.

def onlyOne1(s):
	r = []
	for x in s:
		if x not in r:
			r.append(x)
	return r

def onlyOne2(s):
	r = []
	for x in s:
		try:
			r.index(x)
		except ValueError:  # list.index raises ValueError when x is absent
			r.append(x)
	return r

Thank you in advance!

Luis Leonel Lopez



From martin@loewis.home.cs.tu-berlin.de  Mon May 28 00:23:55 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 28 May 2001 01:23:55 +0200
Subject: [Catalog-sig] Questions about efficiency.
In-Reply-To: <F168FsTmIClmpcQABZT0000d6ad@hotmail.com>
 (luisleonellopez@hotmail.com)
References: <F168FsTmIClmpcQABZT0000d6ad@hotmail.com>
Message-ID: <200105272323.f4RNNtH01352@mira.informatik.hu-berlin.de>

> I wrote two functions which receive a sequence and return a list with 
> non-duplicate elements. As per "unique" function's Tim Peters (as is in 
> Python Cookbook:

Hi Luis,

catalog-sig is probably not the right place to ask this question. What
made you send it here? (This is a serious question, because of all the
SIG lists I read, catalog-sig gets the most off-topic questions.)

In any case, sending your question to python-list@python.org would
have been better.

> my functions use the slowest method by brute force. I'd want to know why 
> and which of my functions is better or efficient and why.
> 
> def onlyOne1(s)
> 	r = []
> 	for x in s:
> 		if x not in r:
> 			r.append(x)
> 	return r
> 
> def onlyOne2(s):
> 	r = []
> 	for x in s:
> 		try:
> 			r.index(x)
> 		except:
> 			r.append(x)
> 	return r

The difference between these functions is minimal. Version 2 is
probably slightly slower, since it has to do exception handling in
case the object is not found in r, plus it does one more function
call.

However, the difference in timing between the two should be minor
compared to what you gain from the typical approach, which is to assume
that all elements can be hashed and to use a dictionary.
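
A minimal sketch of that typical approach, assuming Python 3 and hashable
elements; the function names are made up for illustration. The set gives
constant-time membership tests on average, where "x not in r" on a list
scans the whole list each time:

```python
def only_one_hashed(s):
    """Return the elements of s without duplicates, keeping first-seen order.

    Requires every element to be hashable.
    """
    seen = set()
    result = []
    for x in s:
        if x not in seen:  # average O(1) lookup in a set
            seen.add(x)
            result.append(x)
    return result


def only_one_brute(s):
    """Brute-force version: each membership test scans the result list."""
    result = []
    for x in s:
        if x not in result:  # O(len(result)) lookup in a list
            result.append(x)
    return result


print(only_one_hashed([1, 2, 3, 1, 2, 3]))
print(only_one_hashed("abcabc"))
```

For unhashable elements such as lists, only_one_hashed raises TypeError,
which is why the Cookbook function falls back to sorting and then to
brute force.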

Regards,
Martin



From Mike.Olson@fourthought.com  Mon May 28 21:38:22 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Mon, 28 May 2001 14:38:22 -0600
Subject: [Catalog-sig] Questions about efficiency.
References: <F168FsTmIClmpcQABZT0000d6ad@hotmail.com>
Message-ID: <3B12B73E.3E716146@FourThought.com>

Luis Leonel Lopez wrote:
> 
> Dear friends,
> 
> I wrote two functions which receive a sequence and return a list with
> non-duplicate elements. As per "unique" function's Tim Peters (as is in
> Python Cookbook:




Here is the unique function I wrote.  I haven't really analyzed it much
but it seems to be pretty fast.  At least it is faster than the for loop
approaches for large sets.

def Unique(left):
    return reduce(lambda rt,x:x in rt and rt or rt + [x],left,[])
 

Mike



-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python