From jodok at lovelysystems.com  Mon Jul  2 21:33:38 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Mon, 2 Jul 2007 21:33:38 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
Message-ID: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>

hi,

is it possible that our outgoing proxy server is beeing blocked by  
cheeseshop? it's ip address is 194.183.146.189
no, it was no attack to cheeseshop :) we're simply running buildout  
over and over and probably generating some load.

thanks

jodok

--
"Explicit is better than implicit."
   -- The Zen of Python, by Tim Peters

Jodok Batlogg, Lovely Systems
Schmelzh?tterstra?e 26a, 6850 Dornbirn, Austria
phone: +43 5572 908060, fax: +43 5572 908060-77


-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2454 bytes
Desc: not available
Url : http://mail.python.org/pipermail/catalog-sig/attachments/20070702/69790938/attachment.bin 

From fdrake at gmail.com  Mon Jul  2 23:21:06 2007
From: fdrake at gmail.com (Fred Drake)
Date: Mon, 2 Jul 2007 17:21:06 -0400
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
Message-ID: <9cee7ab80707021421v4c30a348g9bd62272d81b2413@mail.gmail.com>

On 7/2/07, Jodok Batlogg <jodok at lovelysystems.com> wrote:
> is it possible that our outgoing proxy server is beeing blocked by
> cheeseshop? it's ip address is 194.183.146.189
> no, it was no attack to cheeseshop :) we're simply running buildout
> over and over and probably generating some load.

Hey Jodok,

I've taken to only using an internal repository for project buildouts;
if I need/want a new release from PyPI, I load that into the internal
repository.  That avoids depending on PyPI being accessible at all
times, and I can always get what I've used again.  No need to worry
about someone hiding old releases, or whatever.

It in incurs a little overhead on adding or updating a package used in
my projects, but avoids depending on a highly-variable service.  An
internal repository can still have problems, but at least it's easier
to make changes if needed.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"Chaos is the score upon which reality is written." --Henry Miller

From jodok at lovelysystems.com  Mon Jul  2 23:25:33 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Mon, 2 Jul 2007 23:25:33 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <9cee7ab80707021421v4c30a348g9bd62272d81b2413@mail.gmail.com>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
	<9cee7ab80707021421v4c30a348g9bd62272d81b2413@mail.gmail.com>
Message-ID: <003AB009-74C9-4F3A-8C78-F9CA96B31605@lovelysystems.com>


On 02.07.2007, at 23:21, Fred Drake wrote:

> On 7/2/07, Jodok Batlogg <jodok at lovelysystems.com> wrote:
>> is it possible that our outgoing proxy server is beeing blocked by
>> cheeseshop? it's ip address is 194.183.146.189
>> no, it was no attack to cheeseshop :) we're simply running buildout
>> over and over and probably generating some load.
>
> Hey Jodok,
>
> I've taken to only using an internal repository for project buildouts;
> if I need/want a new release from PyPI, I load that into the internal
> repository.  That avoids depending on PyPI being accessible at all
> times, and I can always get what I've used again.  No need to worry
> about someone hiding old releases, or whatever.
>
> It in incurs a little overhead on adding or updating a package used in
> my projects, but avoids depending on a highly-variable service.  An
> internal repository can still have problems, but at least it's easier
> to make changes if needed.

already done after pypi beeing flakey :)
unfortunately now the outgoing ip of this repo is beeing blocked and  
it sucks to scp downloaded files :)

thanks fred,

jodok

>
>
>  -Fred
>
> -- 
> Fred L. Drake, Jr.    <fdrake at gmail.com>
> "Chaos is the score upon which reality is written." --Henry Miller

--
"Errors should never pass silently."
"Unless explicitly silenced."
   -- The Zen of Python, by Tim Peters

Jodok Batlogg, Lovely Systems
Schmelzh?tterstra?e 26a, 6850 Dornbirn, Austria
phone: +43 5572 908060, fax: +43 5572 908060-77


-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2454 bytes
Desc: not available
Url : http://mail.python.org/pipermail/catalog-sig/attachments/20070702/9efc7cec/attachment.bin 

From jim at zope.com  Tue Jul  3 00:04:36 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 2 Jul 2007 18:04:36 -0400
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
Message-ID: <1BBE8714-5AC2-40E0-9182-F628B58F4911@zope.com>


On Jul 2, 2007, at 3:33 PM, Jodok Batlogg wrote:

> hi,
>
> is it possible that our outgoing proxy server is beeing blocked by  
> cheeseshop? it's ip address is 194.183.146.189
> no, it was no attack to cheeseshop :) we're simply running buildout  
> over and over and probably generating some load.

It's hard to believe that buildout could be generating enough load to  
trigger being blocked.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From lac at openend.se  Tue Jul  3 00:16:38 2007
From: lac at openend.se (Laura Creighton)
Date: Tue, 03 Jul 2007 00:16:38 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: Message from Jodok Batlogg <jodok@lovelysystems.com> of "Mon,
	02 Jul 2007 21:33:38 +0200."
	<8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com> 
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com> 
Message-ID: <200707022216.l62MGcg6009085@theraft.openend.se>

Could it be that you are simply out of apache's?  i recall that
Sean set the number of simultaneous ones at some very tiny number.

Laura

From martin at v.loewis.de  Tue Jul  3 00:29:21 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 03 Jul 2007 00:29:21 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <200707022216.l62MGcg6009085@theraft.openend.se>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
	<200707022216.l62MGcg6009085@theraft.openend.se>
Message-ID: <46897C41.7090609@v.loewis.de>

Laura Creighton schrieb:
> Could it be that you are simply out of apache's?  i recall that
> Sean set the number of simultaneous ones at some very tiny number.

I think you misunderstood. He set MaxRequestsPerChild to 10, which
means that each process will be replaced by a different one after
10 requests. MaxClients is 60, which should be more than enough.

Regards,
Martin

From lac at openend.se  Tue Jul  3 00:33:36 2007
From: lac at openend.se (Laura Creighton)
Date: Tue, 03 Jul 2007 00:33:36 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: Message from =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=
	<martin@v.loewis.de> 
	of "Tue, 03 Jul 2007 00:29:21 +0200." <46897C41.7090609@v.loewis.de> 
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
	<200707022216.l62MGcg6009085@theraft.openend.se>
	<46897C41.7090609@v.loewis.de> 
Message-ID: <200707022233.l62MXavr011774@theraft.openend.se>

In a message of Tue, 03 Jul 2007 00:29:21 +0200, "Martin v. L?wis" writes:
>Laura Creighton schrieb:
>> Could it be that you are simply out of apache's?  i recall that
>> Sean set the number of simultaneous ones at some very tiny number.
>
>I think you misunderstood. He set MaxRequestsPerChild to 10, which
>means that each process will be replaced by a different one after
>10 requests. MaxClients is 60, which should be more than enough.
>
>Regards,
>Martin

yes, I thought it was 10.  Sorry about that, and thank you.

Laura


From martin at v.loewis.de  Tue Jul  3 09:22:11 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 03 Jul 2007 09:22:11 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
Message-ID: <4689F923.8030304@v.loewis.de>

> is it possible that our outgoing proxy server is beeing blocked by
> cheeseshop? it's ip address is 194.183.146.189

I can't see anything like that in the configuration of ximinez.

Furthermore, I cannot see that this IP addresses made any attempt
to contact ximinez. I got several accesses from 194.183.146.178,
for various versions of zc.buildout, through setuptools, and
I got requests from 194.183.146.185 through Firefox, but none
from the IP address that you mention. Going back until December
2006 (if I can trust the logs), that machine never made any
access to the Cheeseshop.

Regards,
Martin

From jodok at lovelysystems.com  Tue Jul  3 11:02:19 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Tue, 3 Jul 2007 11:02:19 +0200
Subject: [Catalog-sig] ip 194.183.146.189 blocked
In-Reply-To: <4689F923.8030304@v.loewis.de>
References: <8F1F0605-B424-4597-BADF-1496BDBFC2C1@lovelysystems.com>
	<4689F923.8030304@v.loewis.de>
Message-ID: <5B0A8BC7-CC65-49E6-AA15-CCF591A0EA41@lovelysystems.com>

On 03.07.2007, at 09:22, Martin v. L?wis wrote:

>> is it possible that our outgoing proxy server is beeing blocked by
>> cheeseshop? it's ip address is 194.183.146.189
>
> I can't see anything like that in the configuration of ximinez.
>
> Furthermore, I cannot see that this IP addresses made any attempt
> to contact ximinez. I got several accesses from 194.183.146.178,
> for various versions of zc.buildout, through setuptools, and
> I got requests from 194.183.146.185 through Firefox, but none
> from the IP address that you mention. Going back until December
> 2006 (if I can trust the logs), that machine never made any
> access to the Cheeseshop.

it seems to happen on the network level. i can't ping the machine  
from this ip address :)

coming from  194.183.146.189:

traceroute to ximinez.python.org (82.94.237.219), 64 hops max, 60  
byte packets
  1  lsfw01 (192.168.34.254)  0.727 ms  0.406 ms  0.345 ms
  2  194-183-146-177.tele.net (194.183.146.177)  1.212 ms  1.061 ms   
3.801 ms
  3  cr4-swz1.net.tele.net (194.183.134.8)  6.733 ms  5.034 ms  4.472 ms
  4  fas0-1-70-cr3-swz1.net.tele.net (194.183.133.188)  4.550 ms   
4.581 ms  4.627 ms
  5  atm0-0-r1-hoe1.net.tele.net (194.183.135.34)  5.743 ms  5.471  
ms  5.362 ms
  6  giga0-2.r2-buh1.net.tele.net (194.183.135.194)  7.449 ms  6.484  
ms  5.843 ms
  7  83.144.194.17 (83.144.194.17)  8.407 ms  8.736 ms  8.444 ms
  8  g4-0-211.core01.zrh01.atlas.cogentco.com (149.6.83.129)  9.269  
ms  8.669 ms  8.727 ms
  9  p6-0.core01.str01.atlas.cogentco.com (130.117.0.53)  11.924 ms   
11.825 ms  10.960 ms
10  p3-0.core01.fra03.atlas.cogentco.com (130.117.0.217)  13.820 ms   
14.551 ms  13.941 ms
11  p3-0.core01.ams03.atlas.cogentco.com (130.117.0.145)  21.411 ms   
21.266 ms  20.842 ms
12  t3-1.mpd01.ams03.atlas.cogentco.com (130.117.0.34)  20.100 ms   
21.003 ms  20.880 ms
13  ams-ix.sara.xs4all.net (195.69.144.48)  20.878 ms  20.983 ms   
28.193 ms
14  0.so-6-0-0.xr1.3d12.xs4all.net (194.109.5.1)  21.045 ms  21.486  
ms  20.892 ms
15  0.so-3-0-0.cr1.3d12.xs4all.net (194.109.5.58)  49.436 ms  29.076  
ms  103.199 ms
16  * * *
17  * * *
18  * * *


coming from 194.183.146.179:

traceroute to ximinez.python.org (82.94.237.219), 64 hops max, 60  
byte packets
  1  lsfw01 (192.168.34.254)  2.030 ms  1.495 ms  1.461 ms
  2  * 194-183-146-177.tele.net (194.183.146.177)  1.834 ms  1.646 ms
  3  cr4-swz1.net.tele.net (194.183.134.8)  4.873 ms  6.393 ms  5.318 ms
  4  fas4-0-70-cr1-swz1.net.tele.net (194.183.133.190)  8.466 ms   
196.174 ms  5.562 ms
  5  194.183.142.2 (194.183.142.2)  6.540 ms  6.462 ms  21.969 ms
  6  giga0-2.r2-buh1.net.tele.net (194.183.135.194)  6.642 ms  6.871  
ms  7.797 ms
  7  83.144.194.17 (83.144.194.17)  18.965 ms  9.923 ms  10.459 ms
  8  g4-0-211.core01.zrh01.atlas.cogentco.com (149.6.83.129)  10.003  
ms  9.462 ms  9.945 ms
  9  p6-0.core01.str01.atlas.cogentco.com (130.117.0.53)  13.728 ms   
11.831 ms  12.375 ms
10  p3-0.core01.fra03.atlas.cogentco.com (130.117.0.217)  14.568 ms   
16.176 ms  15.069 ms
11  p3-0.core01.ams03.atlas.cogentco.com (130.117.0.145)  124.421 ms   
134.435 ms  205.047 ms
12  t3-1.mpd01.ams03.atlas.cogentco.com (130.117.0.34)  21.689 ms   
21.962 ms  22.313 ms
13  ams-ix.tc2.xs4all.net (195.69.144.166)  21.655 ms  21.213 ms   
23.011 ms
14  0.so-7-0-0.xr2.3d12.xs4all.net (194.109.5.13)  21.531 ms  21.966  
ms 0.so-7-0-0.xr1.3d12.xs4all.net (194.109.5.9)  21.673 ms
15  0.so-2-0-0.cr1.3d12.xs4all.net (194.109.5.74)  21.526 ms  
0.so-3-0-0.cr1.3d12.xs4all.net (194.109.5.58)  24.606 ms  22.263 ms
16  ximinez.python.org (82.94.237.219)  23.363 ms  21.890 ms  25.506 ms

thanks a lot for your help

jodok

>
> Regards,
> Martin

--
"Simple is better than complex."
   -- The Zen of Python, by Tim Peters

Jodok Batlogg, Lovely Systems
Schmelzh?tterstra?e 26a, 6850 Dornbirn, Austria
phone: +43 5572 908060, fax: +43 5572 908060-77


-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2454 bytes
Desc: not available
Url : http://mail.python.org/pipermail/catalog-sig/attachments/20070703/b6ff4b4e/attachment.bin 

From pje at telecommunity.com  Thu Jul  5 02:56:25 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 04 Jul 2007 20:56:25 -0400
Subject: [Catalog-sig] Cheeseshop login problems?
Message-ID: <20070705005415.3F4F03A4046@sparrow.telecommunity.com>

I can't seem to log in to the Cheeseshop, from any platform or 
machine, whether via script or browser (Firefox or Lynx).  I haven't 
changed my password, but just in case there was an issue with my 
password, I asked for a password reset.

The passwords I received in email don't work either, however, which 
seems to suggest that there is a server problem involved.  :(


From richardjones at optusnet.com.au  Thu Jul  5 05:43:44 2007
From: richardjones at optusnet.com.au (richardjones at optusnet.com.au)
Date: Thu, 05 Jul 2007 13:43:44 +1000
Subject: [Catalog-sig] Cheeseshop login problems?
Message-ID: <200707050343.l653hirE007904@mail06.syd.optusnet.com.au>

An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: http://mail.python.org/pipermail/catalog-sig/attachments/20070705/a90ec30e/attachment.asc 

From martin at v.loewis.de  Thu Jul  5 07:38:36 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 05 Jul 2007 07:38:36 +0200
Subject: [Catalog-sig] Cheeseshop login problems?
In-Reply-To: <200707050343.l653hirE007904@mail06.syd.optusnet.com.au>
References: <200707050343.l653hirE007904@mail06.syd.optusnet.com.au>
Message-ID: <468C83DC.4030605@v.loewis.de>

> No logins appear to work at the moment.
> 
> Has anyone made changes to the apache config recently?

I did - I'll look into it.

Martin

From martin at v.loewis.de  Thu Jul  5 08:22:33 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 05 Jul 2007 08:22:33 +0200
Subject: [Catalog-sig] Cheeseshop login problems?
In-Reply-To: <20070705005415.3F4F03A4046@sparrow.telecommunity.com>
References: <20070705005415.3F4F03A4046@sparrow.telecommunity.com>
Message-ID: <468C8E29.70808@v.loewis.de>

Phillip J. Eby schrieb:
> I can't seem to log in to the Cheeseshop, from any platform or 
> machine, whether via script or browser (Firefox or Lynx).  I haven't 
> changed my password, but just in case there was an issue with my 
> password, I asked for a password reset.
> 
> The passwords I received in email don't work either, however, which 
> seems to suggest that there is a server problem involved.  :(

Please try again; it should work now.

I switched the Cheeseshop from using mod_python to using FastCGI,
but forgot to do the RewriteCond dance. Sorry about that.

Regards,
Martin

From martin at v.loewis.de  Thu Jul  5 08:37:35 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 05 Jul 2007 08:37:35 +0200
Subject: [Catalog-sig] Cheeseshop performance problems solved
Message-ID: <468C91AF.8000304@v.loewis.de>

I think I solved the performance problems of the Cheeseshop,
by switching both the wiki and the Cheeseshop it to FastCGI.
I raised the MaxRequestsPerChild to 1000 again, and
MaxClients back to its default (256). There are four processes
running the PyPI, and four threads running MoinMoin.

If you experience problems, please report exact data and
time of the outage, as well as the nature of the outage
(e.g. if it doesn't respond within a reasonable time,
report what operation you did and after what time you
gave up waiting for a response).

Regards,
Martin

From jim at zope.com  Thu Jul  5 15:32:44 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 5 Jul 2007 09:32:44 -0400
Subject: [Catalog-sig] Cheeseshop login problems?
In-Reply-To: <468C8E29.70808@v.loewis.de>
References: <20070705005415.3F4F03A4046@sparrow.telecommunity.com>
	<468C8E29.70808@v.loewis.de>
Message-ID: <24CECA6B-B9F3-420B-8016-C3C4FBB06548@zope.com>


Hey Martin,

I want to say thanks to you and the other folks who are working on  
trying to address the PyPI performance issues.  Much much thanks!

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Fri Jul  6 01:29:57 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 5 Jul 2007 19:29:57 -0400
Subject: [Catalog-sig] psycoph errors from pypi
Message-ID: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>


I imagine the people working on the cheeseshop are aware of this,  
but, in case you aren't, I'm getting intermittent  errors from the  
cheeseshop.  For example, requests for:  http://www.python.org/pypi/  
Often give:

Error...

There's been a problem with your request

psycopg.ProgrammingError: ERROR:  current transaction is aborted,  
commands ignored until end of transaction block

select name, version, summary, _pypi_ordering
             from releases  where (lower(name) LIKE '%%%%') and  
_pypi_hidden = FALSE
             order by lower(name), _pypi_ordering

or http://www.python.org/pypi/setuptools sometimes gives:

Error...

There's been a problem with your request

psycopg.ProgrammingError: ERROR:  current transaction is aborted,  
commands ignored until end of transaction block


             select name, version, summary, _pypi_hidden
             from releases
             where name = 'setuptools' and _pypi_hidden = False
             order by _pypi_ordering desc

http://www.python.org/pypi/setuptools/0.6c6 gives:

Error...

There's been a problem with your request

psycopg.ProgrammingError: ERROR:  current transaction is aborted,  
commands ignored until end of transaction block

select packages.name as name, stable_version, version, author,
                   author_email, maintainer, maintainer_email,  
home_page,
                   license, summary, description, description_html,  
keywords,
                   platform, download_url, _pypi_ordering, _pypi_hidden,
                   cheesecake_installability_id,
                   cheesecake_documentation_id,
                   cheesecake_code_kwalitee_id
                  from packages, releases
                  where packages.name='setuptools' and version='0.6c6'
                   and packages.name = releases.name

And so on.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Fri Jul  6 04:10:19 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 06 Jul 2007 04:10:19 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
Message-ID: <468DA48B.2020008@v.loewis.de>

Jim Fulton schrieb:
> I imagine the people working on the cheeseshop are aware of this,  
> but, in case you aren't, I'm getting intermittent  errors from the  
> cheeseshop.  For example, requests for:  http://www.python.org/pypi/  

I wasn't aware of this until you reported it.

I don't have a clue what's causing it.

Regards,
Martin

From martin at v.loewis.de  Fri Jul  6 04:33:54 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 06 Jul 2007 04:33:54 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468DA48B.2020008@v.loewis.de>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de>
Message-ID: <468DAA12.4000707@v.loewis.de>

Martin v. L?wis schrieb:
> Jim Fulton schrieb:
>> I imagine the people working on the cheeseshop are aware of this,  
>> but, in case you aren't, I'm getting intermittent  errors from the  
>> cheeseshop.  For example, requests for:  http://www.python.org/pypi/  
> 
> I wasn't aware of this until you reported it.
> 
> I don't have a clue what's causing it.

I now do, somewhat. Apparently, when you discard a cursor object
in psycopg, and create a new one, that doesn't necessarily start
a new transaction. So if there was some SQL error in the connection,
it stops accepting further SQL statements.

I fixed that by rolling back the connection after each request,
and before each new request.

What I don't understand is why there was an error in the first
place (or what that error was).

Regards,
Martin

From jim at zope.com  Fri Jul  6 14:04:06 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 6 Jul 2007 08:04:06 -0400
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468DAA12.4000707@v.loewis.de>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de>
Message-ID: <A4A957B4-2045-416C-9B71-2FDF6DE52680@zope.com>


On Jul 5, 2007, at 10:33 PM, Martin v. L?wis wrote:

> Martin v. L?wis schrieb:
>> Jim Fulton schrieb:
>>> I imagine the people working on the cheeseshop are aware of this,
>>> but, in case you aren't, I'm getting intermittent  errors from the
>>> cheeseshop.  For example, requests for:  http://www.python.org/pypi/
>>
>> I wasn't aware of this until you reported it.
>>
>> I don't have a clue what's causing it.
>
> I now do, somewhat. Apparently, when you discard a cursor object
> in psycopg, and create a new one, that doesn't necessarily start
> a new transaction. So if there was some SQL error in the connection,
> it stops accepting further SQL statements.
>
> I fixed that by rolling back the connection after each request,
> and before each new request.
>
> What I don't understand is why there was an error in the first
> place (or what that error was).

OK, this probably isn't helpful, but I can't help asking an obvious  
question.  Did something change in the software other than a switch  
from mod_python to FastCGI?

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Fri Jul  6 14:16:47 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 6 Jul 2007 08:16:47 -0400
Subject: [Catalog-sig] Cheeseshop performance improved
In-Reply-To: <20070626105201.GA14025@tummy.com>
References: <467CC2E1.3010708@v.loewis.de>
	<E7852F4F-9599-449F-822F-C7F33783DF9F@zope.com>
	<DCC2A626-BCA9-4CC8-9226-77FA12CE86F3@zope.com>
	<46801FDC.4060502@v.loewis.de>
	<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
	<46802A10.8080205@v.loewis.de>
	<A4C9B0C7-1420-4398-B85A-C601C18204C6@zope.com>
	<200706252144.l5PLi7cs032424@theraft.openend.se>
	<20070626105201.GA14025@tummy.com>
Message-ID: <F7A9A3D2-7F14-4CEC-ABA4-8C3DF87F4D43@zope.com>


On Jun 26, 2007, at 6:52 AM, Sean Reifschneider wrote:
...
> The quick fix would be to engage XS4ALL to upgrade the RAM in that  
> box,
> leaving the box otherwise untouched.  The system has only 1GB of  
> RAM in it.
> It's got a 2.8GHz Xeon CPU in it, so I would expect it can take at  
> least
> 4GB of RAM, if not 8 or 16GB.
>
> Thomas: If the PSF threw a grand or two at XS4ALL, could we get the  
> memory
> in ximinez upgraded?  Preferably to 4 or 8GB of RAM?

What is the status if this?  This seems like a promising early step  
and a pretty darn good use of PSF funds.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Fri Jul  6 19:21:00 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 06 Jul 2007 13:21:00 -0400
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <A4A957B4-2045-416C-9B71-2FDF6DE52680@zope.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de>
	<A4A957B4-2045-416C-9B71-2FDF6DE52680@zope.com>
Message-ID: <20070706171848.268C23A4046@sparrow.telecommunity.com>

At 08:04 AM 7/6/2007 -0400, Jim Fulton wrote:
>On Jul 5, 2007, at 10:33 PM, Martin v. L?wis wrote:
> > I now do, somewhat. Apparently, when you discard a cursor object
> > in psycopg, and create a new one, that doesn't necessarily start
> > a new transaction. So if there was some SQL error in the connection,
> > it stops accepting further SQL statements.
> >
> > I fixed that by rolling back the connection after each request,
> > and before each new request.
> >
> > What I don't understand is why there was an error in the first
> > place (or what that error was).
>
>OK, this probably isn't helpful, but I can't help asking an obvious
>question.  Did something change in the software other than a switch
>from mod_python to FastCGI?

That wouldn't be necessary for this to become a problem.  If PyPI was 
CGI before, then any sort of transient SQL problem wouldn't have had 
this effect, because the DB connection would've been closed at the 
end of each request.  So, it's probably an existing SQL error in PyPI.


From jafo at tummy.com  Fri Jul  6 23:45:27 2007
From: jafo at tummy.com (Sean Reifschneider)
Date: Fri, 6 Jul 2007 15:45:27 -0600
Subject: [Catalog-sig] Cheeseshop performance improved
In-Reply-To: <F7A9A3D2-7F14-4CEC-ABA4-8C3DF87F4D43@zope.com>
References: <467CC2E1.3010708@v.loewis.de>
	<E7852F4F-9599-449F-822F-C7F33783DF9F@zope.com>
	<DCC2A626-BCA9-4CC8-9226-77FA12CE86F3@zope.com>
	<46801FDC.4060502@v.loewis.de>
	<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
	<46802A10.8080205@v.loewis.de>
	<A4C9B0C7-1420-4398-B85A-C601C18204C6@zope.com>
	<200706252144.l5PLi7cs032424@theraft.openend.se>
	<20070626105201.GA14025@tummy.com>
	<F7A9A3D2-7F14-4CEC-ABA4-8C3DF87F4D43@zope.com>
Message-ID: <20070706214527.GR28082@tummy.com>

On Fri, Jul 06, 2007 at 08:16:47AM -0400, Jim Fulton wrote:
>What is the status if this?  This seems like a promising early step  
>and a pretty darn good use of PSF funds.

I never heard anything from Thomas, which I would think would be the right
person to run this through, as I really don't know anything about the
arrangement we have with XS4ALL.  I guess we'd also need to get the PSF to
approve this, though I'd imagine that'd be little more than a formality.

If we don't have any response from Thomas in a bit, I can try contacting
XS4ALL directly and see if they can give us any ideas.

However, I believe that Martin also thinks that with his FastCGi changes it
should be happy now as is...

Thanks,
Sean
-- 
 I think you are blind to the fact that the hand you hold
 is the hand that holds you down.  -- Everclear
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From martin at v.loewis.de  Sat Jul  7 00:02:47 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 00:02:47 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <A4A957B4-2045-416C-9B71-2FDF6DE52680@zope.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de>
	<A4A957B4-2045-416C-9B71-2FDF6DE52680@zope.com>
Message-ID: <468EBC07.6010607@v.loewis.de>

>> I now do, somewhat. Apparently, when you discard a cursor object
>> in psycopg, and create a new one, that doesn't necessarily start
>> a new transaction. So if there was some SQL error in the connection,
>> it stops accepting further SQL statements.
>>
>> I fixed that by rolling back the connection after each request,
>> and before each new request.
>>
>> What I don't understand is why there was an error in the first
>> place (or what that error was).
> 
> OK, this probably isn't helpful, but I can't help asking an obvious
> question.  Did something change in the software other than a switch from
> mod_python to FastCGI?

Yes, I also made the connections to Postgres persistent, rather than
opening a new connection on each request.

Regards,
Martin

From jim at zope.com  Sat Jul  7 00:06:29 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 6 Jul 2007 18:06:29 -0400
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468EBC07.6010607@v.loewis.de>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de>
	<A4A957B4-2045-416C-9B71-2FDF6DE52680@zope.com>
	<468EBC07.6010607@v.loewis.de>
Message-ID: <0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com>


On Jul 6, 2007, at 6:02 PM, Martin v. L?wis wrote:

>>> I now do, somewhat. Apparently, when you discard a cursor object
>>> in psycopg, and create a new one, that doesn't necessarily start
>>> a new transaction. So if there was some SQL error in the connection,
>>> it stops accepting further SQL statements.
>>>
>>> I fixed that by rolling back the connection after each request,
>>> and before each new request.
>>>
>>> What I don't understand is why there was an error in the first
>>> place (or what that error was).
>>
>> OK, this probably isn't helpful, but I can't help asking an obvious
>> question.  Did something change in the software other than a  
>> switch from
>> mod_python to FastCGI?
>
> Yes, I also made the connections to Postgres persistent, rather than
> opening a new connection on each request.

Ah, OK, that explains it.  This is a reasonable thing to do from a  
performance point of view.  Thanks for plugging away at this. :)

(Of course it's too bad we don't have a better way of testing  
changes. Oh well.)

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Sat Jul  7 00:15:10 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 00:15:10 +0200
Subject: [Catalog-sig] [PSF-Members]  Cheeseshop performance improved
In-Reply-To: <20070706214527.GR28082@tummy.com>
References: <467CC2E1.3010708@v.loewis.de>	<E7852F4F-9599-449F-822F-C7F33783DF9F@zope.com>	<DCC2A626-BCA9-4CC8-9226-77FA12CE86F3@zope.com>	<46801FDC.4060502@v.loewis.de>	<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>	<46802A10.8080205@v.loewis.de>	<A4C9B0C7-1420-4398-B85A-C601C18204C6@zope.com>	<200706252144.l5PLi7cs032424@theraft.openend.se>	<20070626105201.GA14025@tummy.com>	<F7A9A3D2-7F14-4CEC-ABA4-8C3DF87F4D43@zope.com>
	<20070706214527.GR28082@tummy.com>
Message-ID: <468EBEEE.9010404@v.loewis.de>

> I never heard anything from Thomas, which I would think would be the right
> person to run this through, as I really don't know anything about the
> arrangement we have with XS4ALL.  I guess we'd also need to get the PSF to
> approve this, though I'd imagine that'd be little more than a formality.
> 
> If we don't have any response from Thomas in a bit, I can try contacting
> XS4ALL directly and see if they can give us any ideas.

I expect such a project to complete in a matter of months rather
than a matter of days. It took a year or so before the current set of
machines was actively being used (IIRC).

> However, I believe that Martin also thinks that with his FastCGi changes it
> should be happy now as is...

Indeed. If there are further complaints on the performance, I'd like to
hear them (preferably with a way for reproducing them). There is still
stuff that can be done to improve PyPI further, such as better usage of
SQL.

Regards,
Martin

From martin at v.loewis.de  Sat Jul  7 00:22:42 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 00:22:42 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de>
	<A4A957B4-2045-416C-9B71-2FDF6DE52680@zope.com>
	<468EBC07.6010607@v.loewis.de>
	<0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com>
Message-ID: <468EC0B2.9070903@v.loewis.de>

> Ah, OK, that explains it.  This is a reasonable thing to do from a
> performance point of view.  Thanks for plugging away at this. :)
> 
> (Of course it's too bad we don't have a better way of testing changes.
> Oh well.)

If there were volunteer testers, it would be possible to test changes
for some period of time. Such testers would have to build themselves
a PyPI installation, and then checkout all changes that have been
committed (or install them from a tracker where they float around).

Alternatively, if somebody contributed a unit test suite, certain
problems might get caught.

In the specific case, I tested whether PyPI "works" on my local
installation, and I apparently didn't not manage to trigger the
problem. My guess is that it was originally triggered by some
failing concurrent access, which is really hard to test for.

Regards,
Martin

From martin at v.loewis.de  Sat Jul  7 00:40:19 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 00:40:19 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <20070706171848.268C23A4046@sparrow.telecommunity.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de>
	<A4A957B4-2045-416C-9B71-2FDF6DE52680@zope.com>
	<20070706171848.268C23A4046@sparrow.telecommunity.com>
Message-ID: <468EC4D3.5030108@v.loewis.de>

> That wouldn't be necessary for this to become a problem.  If PyPI was
> CGI before, then any sort of transient SQL problem wouldn't have had
> this effect, because the DB connection would've been closed at the end
> of each request.  So, it's probably an existing SQL error in PyPI.

That would be my guess. Another possibility might have been that
there was a Python exception, in which case PyPI would not have invoked
.commit on the transaction (so apparently, the transaction would have
been kept open). I'm unsure whether this might cause problems for
subsequent actions. Still, no such exceptions were reported...

In any case, I now do a .rollback in the case of an exception, and
a .rollback before processing a new request. I'd like to get some
confirmation that this is a sensible approach (or what else best
practice is).

Regards,
Martin

From ianb at colorstudy.com  Sat Jul  7 00:44:51 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 06 Jul 2007 17:44:51 -0500
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468EC0B2.9070903@v.loewis.de>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>	<468DA48B.2020008@v.loewis.de>
	<468DAA12.4000707@v.loewis.de>	<A4A957B4-2045-416C-9B71-2FDF6DE52680@zope.com>	<468EBC07.6010607@v.loewis.de>	<0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com>
	<468EC0B2.9070903@v.loewis.de>
Message-ID: <468EC5E3.2040903@colorstudy.com>

Martin v. L?wis wrote:
>> Ah, OK, that explains it.  This is a reasonable thing to do from a
>> performance point of view.  Thanks for plugging away at this. :)
>>
>> (Of course it's too bad we don't have a better way of testing changes.
>> Oh well.)
> 
> If there were volunteer testers, it would be possible to test changes
> for some period of time. Such testers would have to build themselves
> a PyPI installation, and then checkout all changes that have been
> committed (or install them from a tracker where they float around).
> 
> Alternatively, if somebody contributed a unit test suite, certain
> problems might get caught.
> 
> In the specific case, I tested whether PyPI "works" on my local
> installation, and I apparently didn't not manage to trigger the
> problem. My guess is that it was originally triggered by some
> failing concurrent access, which is really hard to test for.

Are exceptions being logged, and actively sent to someone who can handle 
them?  This particular problem sounds like it is fairly deployment- and 
load-specific, so testing probably wouldn't have found it anyway.

-- 
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org
             : Write code, do good : http://topp.openplans.org/careers

From pje at telecommunity.com  Sat Jul  7 01:20:37 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 06 Jul 2007 19:20:37 -0400
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468EC4D3.5030108@v.loewis.de>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468DA48B.2020008@v.loewis.de> <468DAA12.4000707@v.loewis.de>
	<A4A957B4-2045-416C-9B71-2FDF6DE52680@zope.com>
	<20070706171848.268C23A4046@sparrow.telecommunity.com>
	<468EC4D3.5030108@v.loewis.de>
Message-ID: <20070706231831.B83783A405F@sparrow.telecommunity.com>

At 12:40 AM 7/7/2007 +0200, Martin v. L?wis wrote:
> > That wouldn't be necessary for this to become a problem.  If PyPI was
> > CGI before, then any sort of transient SQL problem wouldn't have had
> > this effect, because the DB connection would've been closed at the end
> > of each request.  So, it's probably an existing SQL error in PyPI.
>
>That would be my guess. Another possibility might have been that
>there was a Python exception, in which case PyPI would not have invoked
>.commit on the transaction (so apparently, the transaction would have
>been kept open). I'm unsure whether this might cause problems for
>subsequent actions. Still, no such exceptions were reported...
>
>In any case, I now do a .rollback in the case of an exception, and
>a .rollback before processing a new request. I'd like to get some
>confirmation that this is a sensible approach (or what else best
>practice is).

The best practice is ensuring that either a commit or rollback 
happens at the end of each web request that uses the 
connection.  Then, there's no chance of a failed but not rolled-back 
transaction continuing to hold locks in the database.

In PostgreSQL's case, the MVCC would prevent such a transaction from 
blocking any read-only transactions, of course.

What you're doing is quite close to best practice; if I understand 
you correctly, it differs only in the case of what happens if there 
is a program error resulting in failure to commit or abort.


From richardjones at optushome.com.au  Sat Jul  7 01:30:46 2007
From: richardjones at optushome.com.au (Richard Jones)
Date: Sat, 7 Jul 2007 09:30:46 +1000
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468EC5E3.2040903@colorstudy.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>
	<468EC0B2.9070903@v.loewis.de> <468EC5E3.2040903@colorstudy.com>
Message-ID: <200707070930.46318.richardjones@optushome.com.au>

On Sat, 7 Jul 2007, Ian Bicking wrote:
> Are exceptions being logged, and actively sent to someone who can handle
> them?  This particular problem sounds like it is fairly deployment- and
> load-specific, so testing probably wouldn't have found it anyway.

Errors are currently emailed to myself and AMK. This is controlled by the 
config on xminiez, so others may receive the error emails if they desire.


    Richard

From renesd at gmail.com  Sat Jul  7 03:22:27 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sat, 7 Jul 2007 11:22:27 +1000
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <468EBEEE.9010404@v.loewis.de>
References: <467CC2E1.3010708@v.loewis.de> <46801FDC.4060502@v.loewis.de>
	<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
	<46802A10.8080205@v.loewis.de>
	<A4C9B0C7-1420-4398-B85A-C601C18204C6@zope.com>
	<200706252144.l5PLi7cs032424@theraft.openend.se>
	<20070626105201.GA14025@tummy.com>
	<F7A9A3D2-7F14-4CEC-ABA4-8C3DF87F4D43@zope.com>
	<20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de>
Message-ID: <64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com>

Hi,

yeah, the sql can be improved.

A lot of the queries cause a sequential scan of all the rows in the
journal and release tables.

I think the cause of this is that one of the tables does not have a
primary key, so postgresql can't optimize the query.  Even if the
table had an incrementing numeric id field, then I think the joins
could be sped up.  I haven't tested this yet, but maybe that'd help -
or maybe there would need to be more changes needed.  Postgresql
definitely needs a PK on each table though.

ps, I'm going to try and finish off that caching/static file work I've
been working on(more on that later).  I guess I'll need to test things
a little differently with fastcgi.  How did you set up a fastcgi pypi?

Cheers,


On 7/7/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > I never heard anything from Thomas, which I would think would be the right
> > person to run this through, as I really don't know anything about the
> > arrangement we have with XS4ALL.  I guess we'd also need to get the PSF to
> > approve this, though I'd imagine that'd be little more than a formality.
> >
> > If we don't have any response from Thomas in a bit, I can try contacting
> > XS4ALL directly and see if they can give us any ideas.
>
> I expect such a project to complete in a matter of months rather
> than a matter of days. It took a year or so before the current set of
> machines was actively being used (IIRC).
>
> > However, I believe that Martin also thinks that with his FastCGi changes it
> > should be happy now as is...
>
> Indeed. If there are further complaints on the performance, I'd like to
> hear them (preferably with a way for reproducing them). There is still
> stuff that can be done to improve PyPI further, such as better usage of
> SQL.
>
> Regards,
> Martin
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>

From renesd at gmail.com  Sat Jul  7 06:24:53 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sat, 7 Jul 2007 14:24:53 +1000
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
Message-ID: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>

Hello,

here is the start of an apache config for using static files if they
exist, and the person is not logged in.

The idea will be to have a www/static/cheeseshop.python.org/pypi/
directory filled with the relevant cached files.

Here's the apache config so far.  It checks to see if the person is
authorized, and if they are it does not use the static files.

There are a couple of special cases... ie the /pypi and pypi urls.

Now I just need to finish off the static file generation code.  It
needs a tool which can run every minute or so, which will look for any
changes.  If it finds changes it will  update just those files.  It
will generate the files in a separate directory first, and then move
them in - so people don't download half generated files.  It will
optionally be able to regenerate all the static files - incase there
are database, or template changes.


Of course the config will have to change a little bit for using fcgi
instead of modpython... but there shouldn't be too much to change.

I've also updated the http://wiki.python.org/moin/CheeseShopDev page
with some things I noticed when installing the cheeseshop again on my
laptop.  Mainly dependencies, and missing config steps.


NameVirtualHost 192.168.0.3
<VirtualHost 192.168.0.3>
	ServerAdmin webmaster at localhost
	ServerName gracerr.pretendpaper.com


	DocumentRoot /home/rene/dev/python/cheeseshop/packages/trunk/www/

       # Redirect RSS to a static file
       Alias  /pypi/?:action=rss /data/www/pypi/pypi_rss.xml

	<Directory /home/rene/dev/python/cheeseshop/packages/trunk/www/>
		Options Indexes FollowSymLinks MultiViews
		AllowOverride None
		Order allow,deny
		allow from all

	</Directory>




       AddHandler cgi-script .cgi
       <Directory /data/packages/>
         Options Indexes
       </Directory>



       <Location /pypi2>
         SetHandler mod_python
         #PythonPath "['/data/pypi/src/pypi']+sys.path"
         PythonPath
"['/home/rene/dev/python/cheeseshop/packages/trunk/pypi']+sys.path"
         PythonHandler pypi::handle
         PythonDebug On

         # 2007-06-15 -- POSTs to /pypi every second
         deny from 69.55.232.188


       </Location>

       # Rewrite rules
       RewriteEngine on

       # if the authorization header is empty, redirect.
       RewriteCond %{HTTP:authorization}  ^$
       RewriteRule ^(.*)pypi/$ /static/package_index.html [L]
       #RewriteRule ^(.*)pypi$ /static/front-page.html [L]

       # always make the /pypi empty one go straight through.
       RewriteRule ^(.*)pypi$ /pypi2 [PT]


        # a file, or a directory, and empty authorization header.

        RewriteCond %{HTTP:authorization}  ^$
	RewriteCond /home/rene/dev/python/cheeseshop/packages/trunk/www/static/gracerr.pretendpaper.com/%{REQUEST_FILENAME}
-f
	RewriteRule ^(.*)pypi/(.*) /static/gracerr.pretendpaper.com/pypi/$2 [PT]

        RewriteCond %{HTTP:authorization}  ^$
	RewriteCond /home/rene/dev/python/cheeseshop/packages/trunk/www/static/gracerr.pretendpaper.com/%{REQUEST_FILENAME}
-d
	RewriteRule ^(.*)pypi/(.*) /static/gracerr.pretendpaper.com/pypi/$2 [PT]


	# Look here instead...
	RewriteRule (.*) /pypi2/$1 [PT]



       # Point to package directory
       RewriteRule /packages(/.*)?$ /data/packages$1 [last]
       RewriteRule /icons/(.*$) /usr/share/apache2/icons/$1 [last]

       RedirectMatch permanent ^/$ "http://gracerr.pretendpaper.com/pypi"



  RewriteLog /var/log/apache2/rewrite.log

  RewriteLogLevel 9





	ErrorLog /var/log/apache2/grace_error.log

	# Possible values include: debug, info, notice, warn, error, crit,
	# alert, emerg.
	#LogLevel warn
	LogLevel debug

	CustomLog /var/log/apache2/grace_access.log combined
	#ServerSignature On
	

        # mkdir /var/tmp/proxy2/cheeseshop
	# chown www-data: /var/tmp/proxy2/cheeseshop

#	CacheRoot "/var/tmp/proxy2/cheeseshop"
#	CacheEnable disk /
#	CacheSize 4000000
#	# CacheMinFileSize setting this so that 403 forbidden pages are not cached.
#	CacheMinFileSize 400
#	CacheDirLevels 5
#	CacheDirLength 3
#	#CacheGcInterval 4
#	CacheMaxExpire 24
#	CacheLastModifiedFactor 0.1
#	CacheDefaultExpire 1
#	#CacheForceCompletion 100



</VirtualHost>

From martin at v.loewis.de  Sat Jul  7 08:30:53 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 08:30:53 +0200
Subject: [Catalog-sig] psycoph errors from pypi
In-Reply-To: <468EC5E3.2040903@colorstudy.com>
References: <2AF93E84-A1F3-4B18-9D9B-6F1A6E25B75B@zope.com>	<468DA48B.2020008@v.loewis.de>
	<468DAA12.4000707@v.loewis.de>	<A4A957B4-2045-416C-9B71-2FDF6DE52680@zope.com>	<468EBC07.6010607@v.loewis.de>	<0979795A-1F22-4C2E-871D-90F16C3494F1@zope.com>
	<468EC0B2.9070903@v.loewis.de> <468EC5E3.2040903@colorstudy.com>
Message-ID: <468F331D.1080904@v.loewis.de>

> Are exceptions being logged, and actively sent to someone who can handle
> them?  This particular problem sounds like it is fairly deployment- and
> load-specific, so testing probably wouldn't have found it anyway.

They are sent by email. AFAICT, they are not logged.

Regards,
Martin


From martin at v.loewis.de  Sat Jul  7 08:44:21 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 08:44:21 +0200
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com>
References: <467CC2E1.3010708@v.loewis.de>
	<46801FDC.4060502@v.loewis.de>	<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>	<46802A10.8080205@v.loewis.de>	<A4C9B0C7-1420-4398-B85A-C601C18204C6@zope.com>	<200706252144.l5PLi7cs032424@theraft.openend.se>	<20070626105201.GA14025@tummy.com>	<F7A9A3D2-7F14-4CEC-ABA4-8C3DF87F4D43@zope.com>	<20070706214527.GR28082@tummy.com>
	<468EBEEE.9010404@v.loewis.de>
	<64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com>
Message-ID: <468F3645.1030000@v.loewis.de>

> A lot of the queries cause a sequential scan of all the rows in the
> journal and release tables.
> 
> I think the cause of this is that one of the tables does not have a
> primary key, so postgresql can't optimize the query.  Even if the
> table had an incrementing numeric id field, then I think the joins
> could be sped up.  I haven't tested this yet, but maybe that'd help -
> or maybe there would need to be more changes needed.  Postgresql
> definitely needs a PK on each table though.

Not definitely - and index is enough. A PK only adds an additional
constraint, and does not contribute in itself to performance.
In any case, I plan to add a name-version index to release_classifiers,
as the browsing often looks into release_classifiers by name and
version.

> ps, I'm going to try and finish off that caching/static file work I've
> been working on(more on that later).  I guess I'll need to test things
> a little differently with fastcgi.  How did you set up a fastcgi pypi?

FastCgiServer /data/pypi/src/pypi/pypi.fcgi  -idle-timeout 60 -processes 4

then

   # Trick Apache in providing Basic-Auth to pypi.fcgi
   RewriteCond %{HTTP:Authorization}  ^(.+)$
   RewriteRule ^/pypi(.*) /data/pypi/src/pypi/pypi.fcgi$1
[e=HTTP_CGI_AUTHORIZATION:%1,l]
   ScriptAlias /pypi /data/pypi/src/pypi/pypi.fcgi

Regards,
Martin

From martin at v.loewis.de  Sat Jul  7 09:12:20 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 09:12:20 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
Message-ID: <468F3CD4.1070501@v.loewis.de>

> Now I just need to finish off the static file generation code.  It
> needs a tool which can run every minute or so, which will look for any
> changes.

Would it be possible to trigger that explicitly by a write operation?
I'm doubtful about cron jobs for that kind of stuff - they run both
too often and too infrequent. It's too often because most of the time,
nothing changes, and too infrequent, because the user making the change
won't see it, and wonders where it got lost (they will see the change
as long they are logged in, then they log out, and the release is not
there).

IIUC, every addition to the journals should trigger a change, and then
the updating of the download counters. There are also changes to the
templates, but it would be ok if one would have to trigger regeneration
manually in this case.

Regards,
Martin

From renesd at gmail.com  Sat Jul  7 09:38:18 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sat, 7 Jul 2007 17:38:18 +1000
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <468F3CD4.1070501@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
Message-ID: <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>

Yeah, that could be triggered then.

For the case of multiple changes at a similar time, we could add some
checks to make sure the updater process is only running once.
Otherwise for the case when there are a few changes happening at a
time, the machine would get unnecessarily overloaded.


On 7/7/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > Now I just need to finish off the static file generation code.  It
> > needs a tool which can run every minute or so, which will look for any
> > changes.
>
> Would it be possible to trigger that explicitly by a write operation?
> I'm doubtful about cron jobs for that kind of stuff - they run both
> too often and too infrequent. It's too often because most of the time,
> nothing changes, and too infrequent, because the user making the change
> won't see it, and wonders where it got lost (they will see the change
> as long they are logged in, then they log out, and the release is not
> there).
>
> IIUC, every addition to the journals should trigger a change, and then
> the updating of the download counters. There are also changes to the
> templates, but it would be ok if one would have to trigger regeneration
> manually in this case.
>
> Regards,
> Martin
>

From renesd at gmail.com  Sat Jul  7 09:41:54 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sat, 7 Jul 2007 17:41:54 +1000
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <468F3645.1030000@v.loewis.de>
References: <467CC2E1.3010708@v.loewis.de> <46802A10.8080205@v.loewis.de>
	<A4C9B0C7-1420-4398-B85A-C601C18204C6@zope.com>
	<200706252144.l5PLi7cs032424@theraft.openend.se>
	<20070626105201.GA14025@tummy.com>
	<F7A9A3D2-7F14-4CEC-ABA4-8C3DF87F4D43@zope.com>
	<20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de>
	<64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com>
	<468F3645.1030000@v.loewis.de>
Message-ID: <64ddb72c0707070041n5eb565c1jdaa25e4c9d583641@mail.gmail.com>

Thanks.

I thought because of the types of joins being done postgresql needed a
primary key - but maybe you can get them working with just some more
indices.


On 7/7/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > A lot of the queries cause a sequential scan of all the rows in the
> > journal and release tables.
> >
> > I think the cause of this is that one of the tables does not have a
> > primary key, so postgresql can't optimize the query.  Even if the
> > table had an incrementing numeric id field, then I think the joins
> > could be sped up.  I haven't tested this yet, but maybe that'd help -
> > or maybe there would need to be more changes needed.  Postgresql
> > definitely needs a PK on each table though.
>
> Not definitely - and index is enough. A PK only adds an additional
> constraint, and does not contribute in itself to performance.
> In any case, I plan to add a name-version index to release_classifiers,
> as the browsing often looks into release_classifiers by name and
> version.
>
> > ps, I'm going to try and finish off that caching/static file work I've
> > been working on(more on that later).  I guess I'll need to test things
> > a little differently with fastcgi.  How did you set up a fastcgi pypi?
>
> FastCgiServer /data/pypi/src/pypi/pypi.fcgi  -idle-timeout 60 -processes 4
>
> then
>
>    # Trick Apache in providing Basic-Auth to pypi.fcgi
>    RewriteCond %{HTTP:Authorization}  ^(.+)$
>    RewriteRule ^/pypi(.*) /data/pypi/src/pypi/pypi.fcgi$1
> [e=HTTP_CGI_AUTHORIZATION:%1,l]
>    ScriptAlias /pypi /data/pypi/src/pypi/pypi.fcgi
>
> Regards,
> Martin
>

From jafo at tummy.com  Sat Jul  7 10:18:13 2007
From: jafo at tummy.com (Sean Reifschneider)
Date: Sat, 7 Jul 2007 02:18:13 -0600
Subject: [Catalog-sig] [PSF-Members]  Cheeseshop performance improved
In-Reply-To: <468EBEEE.9010404@v.loewis.de>
References: <DCC2A626-BCA9-4CC8-9226-77FA12CE86F3@zope.com>
	<46801FDC.4060502@v.loewis.de>
	<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
	<46802A10.8080205@v.loewis.de>
	<A4C9B0C7-1420-4398-B85A-C601C18204C6@zope.com>
	<200706252144.l5PLi7cs032424@theraft.openend.se>
	<20070626105201.GA14025@tummy.com>
	<F7A9A3D2-7F14-4CEC-ABA4-8C3DF87F4D43@zope.com>
	<20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de>
Message-ID: <20070707081813.GS28082@tummy.com>

On Sat, Jul 07, 2007 at 12:15:10AM +0200, "Martin v. L?wis" wrote:
>I expect such a project to complete in a matter of months rather
>than a matter of days. It took a year or so before the current set of

I believe that Jim was referring to the memory upgrade of ximinez, not the
getting creosote replaced with a new box.  The memory upgrade should tale
little if any of our time.

Thanks,
Sean
-- 
 moshez always wanted to invent a compression scheme called "feather",
 so he could tar and feather his files.
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From renesd at gmail.com  Sat Jul  7 11:03:24 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sat, 7 Jul 2007 19:03:24 +1000
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
Message-ID: <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>

Hi,

I tried using memcached for caching the database queries - for logged
in users.  It did speed it up a little, but not that much.  It turns
out that the templates take most of the time - at least on my machine.
 I guess pagetemplates are not that quick?

Here's the modified files if you want to try it out yourself:
http://rene.f0o.com/~rene/stuff/store.py
http://rene.f0o.com/~rene/stuff/webui.py

I just tried out on the queries that /pypi /pypi/ use.

There's some timing in the webui that gets written to a file in /tmp/asdfsdaf

For concurrent access then memcached will make more of a difference though.

Memcache could help even for logged in people, but I think replacing
the template language with something faster will have the most effect.


Cheers,



On 7/7/07, Ren? Dudfield <renesd at gmail.com> wrote:
> Yeah, that could be triggered then.
>
> For the case of multiple changes at a similar time, we could add some
> checks to make sure the updater process is only running once.
> Otherwise for the case when there are a few changes happening at a
> time, the machine would get unnecessarily overloaded.
>
>
> On 7/7/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > > Now I just need to finish off the static file generation code.  It
> > > needs a tool which can run every minute or so, which will look for any
> > > changes.
> >
> > Would it be possible to trigger that explicitly by a write operation?
> > I'm doubtful about cron jobs for that kind of stuff - they run both
> > too often and too infrequent. It's too often because most of the time,
> > nothing changes, and too infrequent, because the user making the change
> > won't see it, and wonders where it got lost (they will see the change
> > as long they are logged in, then they log out, and the release is not
> > there).
> >
> > IIUC, every addition to the journals should trigger a change, and then
> > the updating of the download counters. There are also changes to the
> > templates, but it would be ok if one would have to trigger regeneration
> > manually in this case.
> >
> > Regards,
> > Martin
> >
>

From jim at zope.com  Sat Jul  7 16:30:24 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 7 Jul 2007 10:30:24 -0400
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com>
References: <467CC2E1.3010708@v.loewis.de> <46801FDC.4060502@v.loewis.de>
	<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
	<46802A10.8080205@v.loewis.de>
	<A4C9B0C7-1420-4398-B85A-C601C18204C6@zope.com>
	<200706252144.l5PLi7cs032424@theraft.openend.se>
	<20070626105201.GA14025@tummy.com>
	<F7A9A3D2-7F14-4CEC-ABA4-8C3DF87F4D43@zope.com>
	<20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de>
	<64ddb72c0707061822x615de207qf1a0520f23ee801d@mail.gmail.com>
Message-ID: <2F7122AD-4F6C-4714-9955-0E12AF8A6864@zope.com>


On Jul 6, 2007, at 9:22 PM, Ren? Dudfield wrote:
...
> ps, I'm going to try and finish off that caching/static file work I've
> been working on(more on that later).

Yay!

>   I guess I'll need to test things
> a little differently with fastcgi.  How did you set up a fastcgi pypi?

Does it matter? Couldn't you just test with CGI?

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Sat Jul  7 17:19:19 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 7 Jul 2007 11:19:19 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
Message-ID: <F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>


On Jul 7, 2007, at 12:24 AM, Ren? Dudfield wrote:
...
> Now I just need to finish off the static file generation code.  It
> needs a tool which can run every minute or so, which will look for any
> changes.

Why not write the files when the underlying packages change?

I don't like polling for two reasons:

- New pages are out of date for up to the polling interval.  This is  
especially annoying for someone who uploads a package and wants to be  
able to access it immediately.

- Polling all of the pages to see what's changed doesn't seem  
scalable to me.

...

> I've also updated the http://wiki.python.org/moin/CheeseShopDev page
> with some things I noticed when installing the cheeseshop again on my
> laptop.  Mainly dependencies, and missing config steps.

Thanks!

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Sat Jul  7 18:39:42 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 18:39:42 +0200
Subject: [Catalog-sig] [PSF-Members]  Cheeseshop performance improved
In-Reply-To: <20070707081813.GS28082@tummy.com>
References: <DCC2A626-BCA9-4CC8-9226-77FA12CE86F3@zope.com>	<46801FDC.4060502@v.loewis.de>	<65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>	<46802A10.8080205@v.loewis.de>	<A4C9B0C7-1420-4398-B85A-C601C18204C6@zope.com>	<200706252144.l5PLi7cs032424@theraft.openend.se>	<20070626105201.GA14025@tummy.com>	<F7A9A3D2-7F14-4CEC-ABA4-8C3DF87F4D43@zope.com>	<20070706214527.GR28082@tummy.com>
	<468EBEEE.9010404@v.loewis.de> <20070707081813.GS28082@tummy.com>
Message-ID: <468FC1CE.8080708@v.loewis.de>

>> I expect such a project to complete in a matter of months rather
>> than a matter of days. It took a year or so before the current set of
> 
> I believe that Jim was referring to the memory upgrade of ximinez, not the
> getting creosote replaced with a new box.  The memory upgrade should tale
> little if any of our time.

Ah, ok. If you would like to find the right person at XS4ALL to talk to,
please go ahead - else I could try myself.

Regards,
Martin

From martin at v.loewis.de  Sat Jul  7 18:43:39 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 18:43:39 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
Message-ID: <468FC2BB.7030607@v.loewis.de>

> I tried using memcached for caching the database queries - for logged
> in users.  It did speed it up a little, but not that much.  It turns
> out that the templates take most of the time - at least on my machine.

For the majority of pages generated through page templates, I think
the static generation would be fine. I'm looking primarily into the
browse interface at the moment.

> There's some timing in the webui that gets written to a file in /tmp/asdfsdaf
> 
> For concurrent access then memcached will make more of a difference though.
> 
> Memcache could help even for logged in people, but I think replacing
> the template language with something faster will have the most effect.

I'm quite skeptical on caching in general (even about the static page
generation). It *should* be possible to make it fast enough so that
it doesn't need caching. I consider caching a work-around, not a
solution - and one with severe drawbacks.

Regards,
Martin

From jim at zope.com  Sat Jul  7 19:48:50 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 7 Jul 2007 13:48:50 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <468FC2BB.7030607@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
Message-ID: <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>


On Jul 7, 2007, at 12:43 PM, Martin v. L?wis wrote:
...
> I'm quite skeptical on caching in general (even about the static page
> generation). It *should* be possible to make it fast enough so that
> it doesn't need caching.

Sure, with more hardware than we want to afford.

> I consider caching a work-around, not a
> solution - and one with severe drawbacks.

The pages we're talking about are static.  They change at well-known  
times. IMO, It's crazy to serve static content dynamically when it's  
easy to serve it statically.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jafo at tummy.com  Sat Jul  7 20:56:30 2007
From: jafo at tummy.com (Sean Reifschneider)
Date: Sat, 7 Jul 2007 12:56:30 -0600
Subject: [Catalog-sig] [PSF-Members]  Cheeseshop performance improved
In-Reply-To: <468FC1CE.8080708@v.loewis.de>
References: <65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
	<46802A10.8080205@v.loewis.de>
	<A4C9B0C7-1420-4398-B85A-C601C18204C6@zope.com>
	<200706252144.l5PLi7cs032424@theraft.openend.se>
	<20070626105201.GA14025@tummy.com>
	<F7A9A3D2-7F14-4CEC-ABA4-8C3DF87F4D43@zope.com>
	<20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de>
	<20070707081813.GS28082@tummy.com> <468FC1CE.8080708@v.loewis.de>
Message-ID: <20070707185630.GV28082@tummy.com>

On Sat, Jul 07, 2007 at 06:39:42PM +0200, "Martin v. L?wis" wrote:
>Ah, ok. If you would like to find the right person at XS4ALL to talk to,
>please go ahead - else I could try myself.

I've sent a request to the "sales" e-mail contact explaining what we're
trying to do and asking for direction.

Thanks,
Sean
-- 
 You know you're in Canada when:  You see a flyer advertising a polka-fest
 at the curling rink.
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From thomas at python.org  Sat Jul  7 21:38:00 2007
From: thomas at python.org (Thomas Wouters)
Date: Sat, 7 Jul 2007 12:38:00 -0700
Subject: [Catalog-sig] [PSF-Members] Cheeseshop performance improved
In-Reply-To: <20070707185630.GV28082@tummy.com>
References: <65F50ECE-9555-4F7E-B450-4ECD19E18795@zope.com>
	<A4C9B0C7-1420-4398-B85A-C601C18204C6@zope.com>
	<200706252144.l5PLi7cs032424@theraft.openend.se>
	<20070626105201.GA14025@tummy.com>
	<F7A9A3D2-7F14-4CEC-ABA4-8C3DF87F4D43@zope.com>
	<20070706214527.GR28082@tummy.com> <468EBEEE.9010404@v.loewis.de>
	<20070707081813.GS28082@tummy.com> <468FC1CE.8080708@v.loewis.de>
	<20070707185630.GV28082@tummy.com>
Message-ID: <9e804ac0707071238v6664e9c1xb954fe805f5ebb15@mail.gmail.com>

On 7/7/07, Sean Reifschneider <jafo at tummy.com> wrote:
>
> On Sat, Jul 07, 2007 at 06:39:42PM +0200, "Martin v. L?wis" wrote:
> >Ah, ok. If you would like to find the right person at XS4ALL to talk to,
> >please go ahead - else I could try myself.
>
> I've sent a request to the "sales" e-mail contact explaining what we're
> trying to do and asking for direction.


I doubt they can figure out what to do, frankly, since we're not an official
sales customer. But who knows, they might surprise me ;) I sent out an email
asking for extra memory last week, but I've been busy with work and
travelling (first Mountain View for Google, now Vilnius for EuroPython) and
haven't had a chance to find out if the people I asked are even in the
country right now. If you don't hear back from sales, let me know and I'll
ask around more.

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/catalog-sig/attachments/20070707/0500495c/attachment.html 

From martin at v.loewis.de  Sat Jul  7 22:24:59 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 07 Jul 2007 22:24:59 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
Message-ID: <468FF69B.2090503@v.loewis.de>

Jim Fulton schrieb:
> ...
>> I'm quite skeptical on caching in general (even about the static page
>> generation). It *should* be possible to make it fast enough so that
>> it doesn't need caching.
> 
> Sure, with more hardware than we want to afford.

So you are saying it's not fast enough already?

Regards,
Martin

From renesd at gmail.com  Sun Jul  8 05:14:56 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sun, 8 Jul 2007 13:14:56 +1000
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>
Message-ID: <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>

Hello,

Cool, ok.  Let's start with event based updating of the static files.


I need to make this tool in this way anyway though.  But we can either
set it up to work with polling, or event based.  We can start with
event based and switch to polling later if needed.

Since none of the files exists at the moment, the tool will be
needed to generate them initially.  Also if templates change, or the
database changes - then the static pages may need regenerating.


Polling is just one sql statement to see if something has changed.
You do this once, no matter how many things have changed.  It's a
really quick, operation if nothing has changed.

Polling ends up being faster if you are constantly having to do things
all the time anyway.  It's what network drivers do these days because
they realise that there are a constant stream of events(interupts)
anyway - so might as well deal with them at a fixed interval.

Logged in users will not see the static file anyway - since they are
logged in, they get to see the dynamically generated stuff.

Imagine this case:
2-3 users are updating their packages, at a similar time.  The main
index then gets regenerated 3 times, rather than once.  The more
people who are changing things the more this method works.  If there
are 20 people changing things at the same time, then there is still
only one update of the main index page.  However since the cheeseshop
only gets updated about 6 times daily, event based is probably better
for the moment.


Anyway... I'm just making the tool which can be used on demand, or at
regular timings.


Cheers,



On 7/8/07, Jim Fulton <jim at zope.com> wrote:
>
> On Jul 7, 2007, at 12:24 AM, Ren? Dudfield wrote:
> ...
> > Now I just need to finish off the static file generation code.  It
> > needs a tool which can run every minute or so, which will look for any
> > changes.
>
> Why not write the files when the underlying packages change?
>
> I don't like polling for two reasons:
>
> - New pages are out of date for up to the polling interval.  This is
> especially annoying for someone who uploads a package and wants to be
> able to access it immediately.
>
> - Polling all of the pages to see what's changed doesn't seem
> scalable to me.
>
> ...
>
> > I've also updated the http://wiki.python.org/moin/CheeseShopDev page
> > with some things I noticed when installing the cheeseshop again on my
> > laptop.  Mainly dependencies, and missing config steps.
>
> Thanks!
>
> Jim
>
> --
> Jim Fulton                      mailto:jim at zope.com             Python Powered!
> CTO                             (540) 361-1714                  http://www.python.org
> Zope Corporation        http://www.zope.com             http://www.zope.org
>
>
>
>

From renesd at gmail.com  Sun Jul  8 05:27:53 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sun, 8 Jul 2007 13:27:53 +1000
Subject: [Catalog-sig] europython cheeseshop sprint? Rolling out changes.
Message-ID: <64ddb72c0707072027p226f7125k3642ef00c5577675@mail.gmail.com>

Hellos,


I'll need to coordinate with someone at some point to implement my
changes... since I don't have access.  I'm at europython, so maybe
that would be a good time to meet up for a little sprint?


Is there anyone with access to the cheeseshop going to europython who
wants to work on implementing these changes?  I don't have subversion
commit access, or access to the server, so I'll need someone else who
does to help me.


Here's the sprint wiki page for sprints:
http://wiki.python.org/moin/EuroPython2007Sprints

I also created a page here:
http://wiki.python.org/moin/EuroPython2007/CheeseshopSprint


We need to decide when to do the sprint too.



Please let me know if you want to join the sprint, and on what day?

What other things do people want to work on at the sprint?

It would be good to set up a different virtual domain so we can test
changes on there without mucking up the normal cheeseshop so much.  It
might be best if I set it up on a separate server for testing, since
apache will have to be restarted a lot.

Since there aren't really any tests for the cheeseshop, should I start
adding some?  If so with which tool?  I'd like to make some tests to
see if the dymanic, or static files are being served - depending if
the user is authorized or not.  I'd also like to

These tests can also serve as monitoring tools - to answer this
question - 'is the cheeseshop still working?'

From renesd at gmail.com  Sun Jul  8 06:48:45 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sun, 8 Jul 2007 14:48:45 +1000
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>
	<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
Message-ID: <64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com>

Hi,

here's the start of the static file generator.  It just works on one
web path, and one fileout at a time so far.  It doesn't figure out the
correct path to put the file, or check to see if there are any
changes.

http://rene.f0o.com/~rene/stuff/pypi/pypi-static-generation.py

# here is like looking at the http://cheeseshop.python.org/pypi/pygame url
python pypi-static-generation.py -create_single /pypi/pygame /tmp/pygame.html

It uses the webui.py code, so that there will not be any repeating of
code.  It does this in a similar manner to how the pypi.py pypi.cgi
and pypi.fcgi codes works.  That is by making its implementation of
the RequestWrapper class.



I thought I'd just keep posting my changes to the mailing list as I
go... so there's some history of changes - and so people can have a
look/review if they want.  If that annoys people I'll stop sending to
the list.



Next up I'm going to put a few functions into store.py.  Ones to check
if a release has changed since a given date.  Also one to see if any
changes at all have happened since a given date.

I'll also add some onChange type functions for releases.  That will be
where all of the code can go for stuff that happens on a change to
releases etc.


cheers,



On 7/8/07, Ren? Dudfield <renesd at gmail.com> wrote:
> Hello,
>
> Cool, ok.  Let's start with event based updating of the static files.
>
>
> I need to make this tool in this way anyway though.  But we can either
> set it up to work with polling, or event based.  We can start with
> event based and switch to polling later if needed.
>
> Since none of the files exists at the moment, the tool will be
> needed to generate them initially.  Also if templates change, or the
> database changes - then the static pages may need regenerating.
>
>
> Polling is just one sql statement to see if something has changed.
> You do this once, no matter how many things have changed.  It's a
> really quick, operation if nothing has changed.
>
> Polling ends up being faster if you are constantly having to do things
> all the time anyway.  It's what network drivers do these days because
> they realise that there are a constant stream of events(interupts)
> anyway - so might as well deal with them at a fixed interval.
>
> Logged in users will not see the static file anyway - since they are
> logged in, they get to see the dynamically generated stuff.
>
> Imagine this case:
> 2-3 users are updating their packages, at a similar time.  The main
> index then gets regenerated 3 times, rather than once.  The more
> people who are changing things the more this method works.  If there
> are 20 people changing things at the same time, then there is still
> only one update of the main index page.  However since the cheeseshop
> only gets updated about 6 times daily, event based is probably better
> for the moment.
>
>
> Anyway... I'm just making the tool which can be used on demand, or at
> regular timings.
>
>
> Cheers,
>
>
>
> On 7/8/07, Jim Fulton <jim at zope.com> wrote:
> >
> > On Jul 7, 2007, at 12:24 AM, Ren? Dudfield wrote:
> > ...
> > > Now I just need to finish off the static file generation code.  It
> > > needs a tool which can run every minute or so, which will look for any
> > > changes.
> >
> > Why not write the files when the underlying packages change?
> >
> > I don't like polling for two reasons:
> >
> > - New pages are out of date for up to the polling interval.  This is
> > especially annoying for someone who uploads a package and wants to be
> > able to access it immediately.
> >
> > - Polling all of the pages to see what's changed doesn't seem
> > scalable to me.
> >
> > ...
> >
> > > I've also updated the http://wiki.python.org/moin/CheeseShopDev page
> > > with some things I noticed when installing the cheeseshop again on my
> > > laptop.  Mainly dependencies, and missing config steps.
> >
> > Thanks!
> >
> > Jim
> >
> > --
> > Jim Fulton                      mailto:jim at zope.com             Python Powered!
> > CTO                             (540) 361-1714                  http://www.python.org
> > Zope Corporation        http://www.zope.com             http://www.zope.org
> >
> >
> >
> >
>

From martin at v.loewis.de  Sun Jul  8 07:19:33 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 08 Jul 2007 07:19:33 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>
	<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
Message-ID: <469073E5.6010201@v.loewis.de>

> Polling is just one sql statement to see if something has changed.

It's not good enough if something has changed - one would also need
to know what precisely has changed, or else you would need to
regenerate everything.

> Polling ends up being faster if you are constantly having to do things
> all the time anyway.

Maybe (I don't fully understand what you try to say).

However, the cheeseshop does not change very often, so you don't
have to do things all the time anyway. If it was, caching would have
no advantage.

> 2-3 users are updating their packages, at a similar time.  The main
> index then gets regenerated 3 times, rather than once.

[Not sure what page precisely you are referring to as "the main index".
I'll assume you talk about the home page]

On July 7 (yesterday), there were 54 changes; the day before, there were
37. Of these, it is typical that multiple changes to the same package
happen within a few seconds, and then no changes happen for many
minutes; often not a single change within an hour.

It very rarely happens that there are 3 users simultaneously updating
their packages.

Regenerating the main index 3 times is very fast. Depending on how
precisely you prevent concurrent updates, and depending on how
similar the times are, the three users may not trigger three updates,
but only two, if the first update is still running when the second
and third one is attempted.

Regards,
Martin

From martin at v.loewis.de  Sun Jul  8 07:29:36 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 08 Jul 2007 07:29:36 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>	<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
	<64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com>
Message-ID: <46907640.3010408@v.loewis.de>

> Next up I'm going to put a few functions into store.py.  Ones to check
> if a release has changed since a given date.  Also one to see if any
> changes at all have happened since a given date.

Is this really necessary? I think it would be sufficient to have a table
of name,version pairs that list the releases that have changed. This
table is filled on modification, and cleared by the regeneration.

Regards,
Martin

From renesd at gmail.com  Sun Jul  8 07:36:21 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sun, 8 Jul 2007 15:36:21 +1000
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <46907640.3010408@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>
	<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
	<64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com>
	<46907640.3010408@v.loewis.de>
Message-ID: <64ddb72c0707072236x6c800515sc8869e31334bd359@mail.gmail.com>

hello,

It's less work to just look up to see when the last change was.
Rather than make another table and store it - duplicating the data.

Cheers,


On 7/8/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > Next up I'm going to put a few functions into store.py.  Ones to check
> > if a release has changed since a given date.  Also one to see if any
> > changes at all have happened since a given date.
>
> Is this really necessary? I think it would be sufficient to have a table
> of name,version pairs that list the releases that have changed. This
> table is filled on modification, and cleared by the regeneration.
>
> Regards,
> Martin
>

From renesd at gmail.com  Sun Jul  8 09:46:18 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sun, 8 Jul 2007 17:46:18 +1000
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <64ddb72c0707072236x6c800515sc8869e31334bd359@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>
	<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
	<64ddb72c0707072148n1a593f5au734a8d22910be16a@mail.gmail.com>
	<46907640.3010408@v.loewis.de>
	<64ddb72c0707072236x6c800515sc8869e31334bd359@mail.gmail.com>
Message-ID: <64ddb72c0707080046j4c1a2566s7cf6ae5cba0ad9c6@mail.gmail.com>

Hi,

here's another update:

http://rene.f0o.com/~rene/stuff/pypi/pypi-static-generation.py

Now you can also create all of the releases listed on the "/pypi/" url.
python pypi-static-generation.py -create_all

It still doesn't do date checking yet.  I'll probably get around to
that tomorrow.

so it creates these files and directories:
/pypi/Pygame/index.html
/pypi/Pygame/1.7.1/index.html

So these urls can use the static files:
/pypi/Pygame/
/pypi/Pygame
/pypi/Pygame/1.7.1
/pypi/Pygame/1.7.1/


It took about 20 minutes to generate all of them on my Ye Olde p3
256MB ram, laptop HD computer.



Cheers,



On 7/8/07, Ren? Dudfield <renesd at gmail.com> wrote:
> hello,
>
> It's less work to just look up to see when the last change was.
> Rather than make another table and store it - duplicating the data.
>
> Cheers,
>
>
> On 7/8/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > > Next up I'm going to put a few functions into store.py.  Ones to check
> > > if a release has changed since a given date.  Also one to see if any
> > > changes at all have happened since a given date.
> >
> > Is this really necessary? I think it would be sufficient to have a table
> > of name,version pairs that list the releases that have changed. This
> > table is filled on modification, and cleared by the regeneration.
> >
> > Regards,
> > Martin
> >
>

From jim at zope.com  Sun Jul  8 14:14:27 2007
From: jim at zope.com (Jim Fulton)
Date: Sun, 8 Jul 2007 08:14:27 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <468FF69B.2090503@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
Message-ID: <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>


On Jul 7, 2007, at 4:24 PM, Martin v. L?wis wrote:

> Jim Fulton schrieb:
>> ...
>>> I'm quite skeptical on caching in general (even about the static  
>>> page
>>> generation). It *should* be possible to make it fast enough so that
>>> it doesn't need caching.
>>
>> Sure, with more hardware than we want to afford.
>
> So you are saying it's not fast enough already?

Uh, yeah. That's what this whole thread has been about. *Maybe* all  
your efforts will make it fast enough.  I'm skeptical though. Also  
understand that now that we're using the cheeseshop to support  
automated builds, the load will increase a lot over time.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Sun Jul  8 18:07:27 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 08 Jul 2007 18:07:27 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
Message-ID: <46910BBF.3010308@v.loewis.de>

>> So you are saying it's not fast enough already?
> 
> Uh, yeah.

Can you please be more precise, then? What kind of operation are
you performing, how long does it take, and how long should it
take so that you would consider it fast enough?

It's difficult to implement a system if the requirements are
unknown to those implementing it.

Regards,
Martin

From pje at telecommunity.com  Sun Jul  8 19:27:56 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 08 Jul 2007 13:27:56 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -  apache config.
In-Reply-To: <6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
Message-ID: <20070708172544.8D2763A404D@sparrow.telecommunity.com>

At 01:48 PM 7/7/2007 -0400, Jim Fulton wrote:

>On Jul 7, 2007, at 12:43 PM, Martin v. L?wis wrote:
>...
> > I'm quite skeptical on caching in general (even about the static page
> > generation). It *should* be possible to make it fast enough so that
> > it doesn't need caching.
>
>Sure, with more hardware than we want to afford.
>
> > I consider caching a work-around, not a
> > solution - and one with severe drawbacks.
>
>The pages we're talking about are static.  They change at well-known
>times. IMO, It's crazy to serve static content dynamically when it's
>easy to serve it statically.

If they're effectively static, why can't Apache cache 
them?  Shouldn't we be able to simply add Last-Modified/If-Modified 
support to the PyPI output, and enable Apache's disk caching for 
non-logged-in users?

That is, as long as there is a quick last-modified-time query for a 
package, we can use those to process the If-Modified header.  The 
modification time could even be memcached, so as not to need a 
database hit 99% of the time.

While that's not necessarily as fast as static page generation, it's 
a lot less complex to get right, and it saves the main piece of CPU 
load: i.e., doing SQL queries and actually generating the page.

Pages that pertain to more than one package might be a bit more 
complex to do this on, but if I understand correctly it's mainly the 
package-specific pages we're concerned with here, correct?  Even so, 
it's possible to have any updates also update a global "something's 
changed" time, and use that time as the Last-Modified of those pages.


From martin at v.loewis.de  Sun Jul  8 19:37:24 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 08 Jul 2007 19:37:24 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -  apache config.
In-Reply-To: <20070708172544.8D2763A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<20070708172544.8D2763A404D@sparrow.telecommunity.com>
Message-ID: <469120D4.60909@v.loewis.de>

> If they're effectively static, why can't Apache cache them?

That's easy to answer: nobody told Apache to do that
(and I don't know how to tell it to).

Ren?'s approach currently is to generate the files explicitly
on disk, and then have Apache return them always from disk.

> Shouldn't
> we be able to simply add Last-Modified/If-Modified support to the PyPI
> output, and enable Apache's disk caching for non-logged-in users?

How precisely would that work? I.e. what software should put what
header into what place, and how would the cache then find out that
the real data have changed?

> While that's not necessarily as fast as static page generation, it's a
> lot less complex to get right, and it saves the main piece of CPU load:
> i.e., doing SQL queries and actually generating the page.

I'm not convinced yet that this is where the time is spent (seeing
actual profiling data would convince me). I have learned to never
ever guess what precisely is consuming cycles in a piece of software.

> Pages that pertain to more than one package might be a bit more complex
> to do this on, but if I understand correctly it's mainly the
> package-specific pages we're concerned with here, correct?

I'm not convinced of that, either.

Regards,
Martin

From renesd at gmail.com  Sun Jul  8 19:47:17 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Mon, 9 Jul 2007 03:47:17 +1000
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <20070708172544.8D2763A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<20070708172544.8D2763A404D@sparrow.telecommunity.com>
Message-ID: <64ddb72c0707081047i1f4209e0j1584c1c2d6863bc5@mail.gmail.com>

Hi,

turning on caching is the plan as well, but after the static files.
See my earlier emails on the subject.

However static pages have their uses too, and are a bit faster than
the cached ones.


On 7/9/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 01:48 PM 7/7/2007 -0400, Jim Fulton wrote:
>
> >On Jul 7, 2007, at 12:43 PM, Martin v. L?wis wrote:
> >...
> > > I'm quite skeptical on caching in general (even about the static page
> > > generation). It *should* be possible to make it fast enough so that
> > > it doesn't need caching.
> >
> >Sure, with more hardware than we want to afford.
> >
> > > I consider caching a work-around, not a
> > > solution - and one with severe drawbacks.
> >
> >The pages we're talking about are static.  They change at well-known
> >times. IMO, It's crazy to serve static content dynamically when it's
> >easy to serve it statically.
>
> If they're effectively static, why can't Apache cache
> them?  Shouldn't we be able to simply add Last-Modified/If-Modified
> support to the PyPI output, and enable Apache's disk caching for
> non-logged-in users?
>
> That is, as long as there is a quick last-modified-time query for a
> package, we can use those to process the If-Modified header.  The
> modification time could even be memcached, so as not to need a
> database hit 99% of the time.
>
> While that's not necessarily as fast as static page generation, it's
> a lot less complex to get right, and it saves the main piece of CPU
> load: i.e., doing SQL queries and actually generating the page.
>
> Pages that pertain to more than one package might be a bit more
> complex to do this on, but if I understand correctly it's mainly the
> package-specific pages we're concerned with here, correct?  Even so,
> it's possible to have any updates also update a global "something's
> changed" time, and use that time as the Last-Modified of those pages.
>
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>

From renesd at gmail.com  Sun Jul  8 19:50:07 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Mon, 9 Jul 2007 03:50:07 +1000
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <469120D4.60909@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<20070708172544.8D2763A404D@sparrow.telecommunity.com>
	<469120D4.60909@v.loewis.de>
Message-ID: <64ddb72c0707081050l55c8beakbc241e5ac94ed7d7@mail.gmail.com>

On 7/9/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > If they're effectively static, why can't Apache cache them?
>
> That's easy to answer: nobody told Apache to do that
> (and I don't know how to tell it to).
>
> Ren?'s approach currently is to generate the files explicitly
> on disk, and then have Apache return them always from disk.

Yeah, have apache return from disk if not logged in.  Also if the
static file is not there, then it generates the page dynamically.

From pje at telecommunity.com  Sun Jul  8 21:33:36 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 08 Jul 2007 15:33:36 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -   apache config.
In-Reply-To: <469120D4.60909@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<20070708172544.8D2763A404D@sparrow.telecommunity.com>
	<469120D4.60909@v.loewis.de>
Message-ID: <20070708193123.CCB803A404D@sparrow.telecommunity.com>

At 07:37 PM 7/8/2007 +0200, Martin v. L?wis wrote:
> > If they're effectively static, why can't Apache cache them?
>
>That's easy to answer: nobody told Apache to do that
>(and I don't know how to tell it to).
>
>Ren?'s approach currently is to generate the files explicitly
>on disk, and then have Apache return them always from disk.
>
> > Shouldn't
> > we be able to simply add Last-Modified/If-Modified support to the PyPI
> > output, and enable Apache's disk caching for non-logged-in users?
>
>How precisely would that work? I.e. what software should put what
>header into what place, and how would the cache then find out that
>the real data have changed?

I was under the impression that when Apache caching is enabled, it 
can add an If-Modified-Since header to incoming requests, and in the 
event that the dynamic content hasn't changed, use its cached version 
of the response.  I am not an expert on this, however.

If it does do this, then PyPI would check for an If-Modified-Since 
header and compare it to the modified date for the page, and return a 
"not changed" response if appropriate.


> > While that's not necessarily as fast as static page generation, it's a
> > lot less complex to get right, and it saves the main piece of CPU load:
> > i.e., doing SQL queries and actually generating the page.
>
>I'm not convinced yet that this is where the time is spent (seeing
>actual profiling data would convince me).

I thought Rene' had done such profiling, as he said it was the 
templates that were taking most of the CPU.


> > Pages that pertain to more than one package might be a bit more complex
> > to do this on, but if I understand correctly it's mainly the
> > package-specific pages we're concerned with here, correct?
>
>I'm not convinced of that, either.

Well, I thought those were the ones we were caching.

It may be that I'm making too many assumptions, but if those 
assumptions are correct, then the whole thing gets a lot easier to 
prove correct, compared to a static cache, due to fewer moving 
parts.  If most CPU time is spent rendering package-specific pages, 
then this approach would fix the problem using the fewest changed 
parts and extra code to maintain.


From martin at v.loewis.de  Sun Jul  8 21:34:00 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 08 Jul 2007 21:34:00 +0200
Subject: [Catalog-sig] ZPT template caching
Message-ID: <46913C28.4060903@v.loewis.de>

I just added template caching to PyPI: rather than parsing
a page template on each request, it caches the templates, and
later renders a pre-parsed one. According to my measurements,
this should reduce the number of Python function calls needed
to render a page noticably.

As a side effect, Apache needs to be restarted when a template
changes (this was already the case for code changes).

Regards,
Martin

From martin at v.loewis.de  Sun Jul  8 22:00:44 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 08 Jul 2007 22:00:44 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -   apache config.
In-Reply-To: <20070708193123.CCB803A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<20070708172544.8D2763A404D@sparrow.telecommunity.com>
	<469120D4.60909@v.loewis.de>
	<20070708193123.CCB803A404D@sparrow.telecommunity.com>
Message-ID: <4691426C.7030501@v.loewis.de>

> I was under the impression that when Apache caching is enabled, it can
> add an If-Modified-Since header to incoming requests, and in the event
> that the dynamic content hasn't changed, use its cached version of the
> response.  I am not an expert on this, however.

Where would it add that? The (F)CGI script doesn't see any headers,
except for those communicated in environment variables. AFAICT,
there is non for if-modified-since.

If you were thinking of mod_cache: it will expire entries after
CacheDefaultExpire (default 1h), unless an Expires or Last-Modified
header is in the original response. In the latter case,
CacheLastModifiedFactor is used to determine an expiry period
(default 10% since last-modified).

>> I'm not convinced yet that this is where the time is spent (seeing
>> actual profiling data would convince me).
> 
> I thought Rene' had done such profiling, as he said it was the templates
> that were taking most of the CPU.

I saw that he said that its taking most of the CPU, however, he didn't
say he did profiling.

I now did, and found that the parsing of the templates takes some time,
so it now caches the parsed templates.

>> > Pages that pertain to more than one package might be a bit more complex
>> > to do this on, but if I understand correctly it's mainly the
>> > package-specific pages we're concerned with here, correct?
>>
>> I'm not convinced of that, either.
> 
> Well, I thought those were the ones we were caching.

Not "were caching", but "going to cache". As I said before, I'm
unconvinced that this is were the load goes; as a consequence,
I'm unconvinced that generating static pages will improve things.

Of course, if Rene completes this project, and the static
pages don't actually break anything, it shouldn't hurt to use them;
then we will see what the saving is (there surely will be *some*
saving, and it might be that those who complain about the performance
most will see a performance increase assuming that they are primarily
interested in the static pages).

> It may be that I'm making too many assumptions, but if those assumptions
> are correct, then the whole thing gets a lot easier to prove correct,
> compared to a static cache, due to fewer moving parts.  If most CPU time
> is spent rendering package-specific pages, then this approach would fix
> the problem using the fewest changed parts and extra code to maintain.

My biggest concern is whether there can be a reliable computation of
"has this changed". If that predicate gives an incorrect response,
it doesn't matter much whether Apache does its own caching, or whether
the static page fail to be regenerated.

Regards,
Martin


From jafo at tummy.com  Mon Jul  9 06:50:38 2007
From: jafo at tummy.com (Sean Reifschneider)
Date: Sun, 8 Jul 2007 22:50:38 -0600
Subject: [Catalog-sig] ZPT template caching
In-Reply-To: <46913C28.4060903@v.loewis.de>
References: <46913C28.4060903@v.loewis.de>
Message-ID: <20070709045038.GA12464@tummy.com>

On Sun, Jul 08, 2007 at 09:34:00PM +0200, "Martin v. L?wis" wrote:
>As a side effect, Apache needs to be restarted when a template
>changes (this was already the case for code changes).

The way I cache our site, I put the cache into memcached, so that the cache
is shared among all apaches, ages out old stuff, and when I update
something I just tell memcached to invalidate everything in it's cache, no
Apache restart necessary.  I *DO* need to restart it if I make code
changes, but not template changes.

Thanks,
Sean
-- 
 If not actually disgruntled, he was far from being gruntled.
                 -- P. G. Wodehouse
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From martin at v.loewis.de  Mon Jul  9 07:08:15 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 09 Jul 2007 07:08:15 +0200
Subject: [Catalog-sig] ZPT template caching
In-Reply-To: <20070709045038.GA12464@tummy.com>
References: <46913C28.4060903@v.loewis.de> <20070709045038.GA12464@tummy.com>
Message-ID: <4691C2BF.1060901@v.loewis.de>

> The way I cache our site, I put the cache into memcached, so that the cache
> is shared among all apaches, ages out old stuff, and when I update
> something I just tell memcached to invalidate everything in it's cache, no
> Apache restart necessary.  I *DO* need to restart it if I make code
> changes, but not template changes.

How can I put parsed zope templates into memcached?

Regards,
Martin

From jafo at tummy.com  Mon Jul  9 07:25:57 2007
From: jafo at tummy.com (Sean Reifschneider)
Date: Sun, 8 Jul 2007 23:25:57 -0600
Subject: [Catalog-sig] ZPT template caching
In-Reply-To: <4691C2BF.1060901@v.loewis.de>
References: <46913C28.4060903@v.loewis.de> <20070709045038.GA12464@tummy.com>
	<4691C2BF.1060901@v.loewis.de>
Message-ID: <20070709052557.GD5041@tummy.com>

On Mon, Jul 09, 2007 at 07:08:15AM +0200, "Martin v. L?wis" wrote:
>How can I put parsed zope templates into memcached?

I have no idea.  I do it by caching the results, which for my application
is all I really care about and don't vary from request to request unless
the data or template has changed, or it's a different day.

Sean
-- 
 Examine what is said, not who speaks.  (Arabian Proverb)
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability
      Back off man. I'm a scientist.   http://HackingSociety.org/


From jim at zope.com  Mon Jul  9 15:49:32 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 9 Jul 2007 09:49:32 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>
	<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
Message-ID: <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>

What Martin said :), and:

On Jul 7, 2007, at 11:14 PM, Ren? Dudfield wrote:
...
> Logged in users will not see the static file anyway - since they are
> logged in, they get to see the dynamically generated stuff.

Here's a common use case:

- A user uploads a new release

- They then use setuptools to install the release from PyPI.   
setuptools will not present their credentials and will therefore  
behave like a logged in user.  It will see and install an older  
version of the package.

This will be very mysterious and annoying to the user that just  
uploaded the release.


> Imagine this case:
> 2-3 users are updating their packages, at a similar time.  The main
> index then gets regenerated 3 times, rather than once.

Who cares.  That's one page that we get dynamically now.


>   The more
> people who are changing things the more this method works.  If there
> are 20 people changing things at the same time, then there is still
> only one update of the main index page.  However since the cheeseshop
> only gets updated about 6 times daily, event based is probably better
> for the moment.

Yup.

> Anyway... I'm just making the tool which can be used on demand, or at
> regular timings.

I wonder if we are talking about the same thing here.  I fear not.   
With event based update, you should only update the pages that need  
to be updated, at worst, this should be the pages for the project  
being updated plus http://www.python.org/pypi/.  The software needed  
for this would be very different than the software that would build  
the static pages initially or update all if a template has changed.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Mon Jul  9 16:09:37 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 9 Jul 2007 10:09:37 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <46910BBF.3010308@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
Message-ID: <A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>


On Jul 8, 2007, at 12:07 PM, Martin v. L?wis wrote:

>>> So you are saying it's not fast enough already?
>>
>> Uh, yeah.
>
> Can you please be more precise, then? What kind of operation are
> you performing,

I'm using setuptools.  Sertuptools looks at package pages (e.g.  
http://www.python.org/pypi/foobar), it looks at:
http://www.python.org/pypi/ and it doenloads distributions. (AFAICT,  
the later is done dynamically too, which
is especially insane.)

> how long does it take,

Lately, it's has often taken minutes.  This has been the major  
problem.  At the best of times. well, I don't know when those are. :)

ATM, requests for http://www.python.org/pypi/zc.buildout takes about  
1/3 second.  Requests for http://cheeseshop.python.org/packages/2.5/z/ 
zc.buildout/zc.buildout-1.0.0b28-py2.5.egg take about 2.5 seconds.  
Requests for http://www.python.org/pypi/ take about 10 seconds.

I would say that these times are too long.


> and how long should it
> take so that you would consider it fast enough?

IMO, it needs to be much much faster.  If we were serving pages  
staticially, we would be able to serve thousands of requests per  
second.  There's nothing about this application that would make doing  
that hard.

> It's difficult to implement a system if the requirements are
> unknown to those implementing it.

I'm sorry, I've been talking about setuptools all along.  I thought  
the use case was understood.  Also, I thought it was pretty obvious  
that the performance we've been seeing lately is totally  
unacceptable.  It's hard to pinpoint exactly what the acceptable  
performance will be, in part because, we we do better, demand will  
increase.  Note that, as it is now, demand is possibly decreasing  
because people are building their own indexes.

If this was an application that had to be served dynamically (and of  
course, parts of it are), then it would be much more interesting to  
discuss targets for dynamic delivery.  The performance-critical parts  
of this application -- the pages that setuptools uses, can readily be  
served statically, so it makes no sense not to do so.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Mon Jul  9 16:21:23 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 9 Jul 2007 10:21:23 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <20070708172544.8D2763A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<20070708172544.8D2763A404D@sparrow.telecommunity.com>
Message-ID: <437CFE1D-125A-4856-936E-27FC688B57BA@zope.com>


On Jul 8, 2007, at 1:27 PM, Phillip J. Eby wrote:

> At 01:48 PM 7/7/2007 -0400, Jim Fulton wrote:
>
>> On Jul 7, 2007, at 12:43 PM, Martin v. L?wis wrote:
>> ...
>> > I'm quite skeptical on caching in general (even about the static  
>> page
>> > generation). It *should* be possible to make it fast enough so that
>> > it doesn't need caching.
>>
>> Sure, with more hardware than we want to afford.
>>
>> > I consider caching a work-around, not a
>> > solution - and one with severe drawbacks.
>>
>> The pages we're talking about are static.  They change at well-known
>> times. IMO, It's crazy to serve static content dynamically when it's
>> easy to serve it statically.
>
> If they're effectively static, why can't Apache cache them?   
> Shouldn't we be able to simply add Last-Modified/If-Modified  
> support to the PyPI output, and enable Apache's disk caching for  
> non-logged-in users?

When caching something, you typically specify a age before you start  
checking. That means that content would be stale for that period.   
Sometimes, that is both acceptable and necessary.  In any case,  
dynamic servers typically take just as long to handle an If-Modified  
or Last-Modified request than they do to handle a regular request. It  
would be just as complicated, if not more so, to get the cheeseshop  
software to do this properly than it would to just bake.



> That is, as long as there is a quick last-modified-time query for a  
> package, we can use those to process the If-Modified header.  The  
> modification time could even be memcached, so as not to need a  
> database hit 99% of the time.

No, it can't be cached.  What would you do to make sure that cache  
wasn't stale.

> While that's not necessarily as fast as static page generation,  
> it's a lot less complex to get right, and it saves the main piece  
> of CPU load: i.e., doing SQL queries and actually generating the page.

It is really easy to get static page generation right for an  
application this simple.  YOu know when pages are invalidated.  The  
page relationships are not at all complicated here.


> Pages that pertain to more than one package might be a bit more  
> complex to do this on, but if I understand correctly it's mainly  
> the package-specific pages we're concerned with here, correct?

Yes, and http://www.python.org/pypi/

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From renesd at gmail.com  Mon Jul  9 17:13:48 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Mon, 9 Jul 2007 19:13:48 +0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>
	<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
	<18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
Message-ID: <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>

Hello Jim,

I double+ agree we should update on change.

On 7/9/07, Jim Fulton <jim at zope.com> wrote:
> Here's a common use case:
>
> - A user uploads a new release
>
> - They then use setuptools to install the release from PyPI.
> setuptools will not present their credentials and will therefore
> behave like a logged in user.  It will see and install an older
> version of the package.
>

You mean it will behave like someone *not* logged in right?  Either
way they should always get the latest change.

The way to do this atomically, so not one can possibly get an old
page, the static file will be removed as the change is committed.
Then everyone gets the latest change right away - as soon as the
change has been committed.

> > Anyway... I'm just making the tool which can be used on demand, or at
> > regular timings.
>
> I wonder if we are talking about the same thing here.  I fear not.
> With event based update, you should only update the pages that need
> to be updated, at worst, this should be the pages for the project
> being updated plus http://www.python.org/pypi/.  The software needed
> for this would be very different than the software that would build
> the static pages initially or update all if a template has changed.
>


These are the commands so far:
python pypi-static-generation.py -create_single /pypi/pygame /tmp/pygame.html
python pypi-static-generation.py -create_all


The generation of the main index page would be:
python pypi-static-generation.py -create_single /pypi/
path_to_static_indexpage.html

Then there would be a command to update the single page:
python pypi-static-generation.py -create_single /pypi/Pygame
path_to_static_pygame.html


Ok, that's all for now.  I'll be able to finish it off in a few days
after europython.

Cheers,

From renesd at gmail.com  Mon Jul  9 17:19:52 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Mon, 9 Jul 2007 19:19:52 +0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
Message-ID: <64ddb72c0707090819l2382c8cu619a85ec0d3464dc@mail.gmail.com>

On 7/9/07, Jim Fulton <jim at zope.com> wrote:
> ATM, requests for http://www.python.org/pypi/zc.buildout takes about
> 1/3 second.  Requests for http://cheeseshop.python.org/packages/2.5/z/
> zc.buildout/zc.buildout-1.0.0b28-py2.5.egg take about 2.5 seconds.
> Requests for http://www.python.org/pypi/ take about 10 seconds.
>
> I would say that these times are too long.
>

Hi again,

Just a note, the static pages through the mod-rewrite logic goes
pretty quickly.  So both those pages can be served at 1000s of
requests per second.



Cheers,

From jim at zope.com  Mon Jul  9 18:27:08 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 9 Jul 2007 12:27:08 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>
	<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
	<18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
	<64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
Message-ID: <34EBAA43-50DD-49F0-BAB0-B114DA870C37@zope.com>


On Jul 9, 2007, at 11:13 AM, Ren? Dudfield wrote:

> Hello Jim,
>
> I double+ agree we should update on change.

Yay! :)

> On 7/9/07, Jim Fulton <jim at zope.com> wrote:
>> Here's a common use case:
>>
>> - A user uploads a new release
>>
>> - They then use setuptools to install the release from PyPI.
>> setuptools will not present their credentials and will therefore
>> behave like a logged in user.  It will see and install an older
>> version of the package.
>>
>
> You mean it will behave like someone *not* logged in right?

Right.

>   Either
> way they should always get the latest change.

Yes, if we update the static on change.

I though you were arguing that it didn't matter of cached pages were  
out of date because the person updating the pages would see the  
changes because they'd see uncached pages.

> The way to do this atomically, so not one can possibly get an old
> page, the static file will be removed as the change is committed.
> Then everyone gets the latest change right away - as soon as the
> change has been committed.

Sure.


>
>> > Anyway... I'm just making the tool which can be used on demand,  
>> or at
>> > regular timings.
>>
>> I wonder if we are talking about the same thing here.  I fear not.
>> With event based update, you should only update the pages that need
>> to be updated, at worst, this should be the pages for the project
>> being updated plus http://www.python.org/pypi/.  The software needed
>> for this would be very different than the software that would build
>> the static pages initially or update all if a template has changed.
>>
>
>
> These are the commands so far:
> python pypi-static-generation.py -create_single /pypi/pygame /tmp/ 
> pygame.html
> python pypi-static-generation.py -create_all

Ah, so one script, 2 behaviors. Fair enough.


> The generation of the main index page would be:
> python pypi-static-generation.py -create_single /pypi/
> path_to_static_indexpage.html
>
> Then there would be a command to update the single page:
> python pypi-static-generation.py -create_single /pypi/Pygame
> path_to_static_pygame.html

Shouldn't that be implied by both of the commands above.

I'm a little surprised that you are doing this as an external script,  
as opposed to adding the behavior to the cheeseshop code, but I guess  
it doesn't matter.

> Ok, that's all for now.  I'll be able to finish it off in a few days
> after europython.

Haven't you been able to get anyone to sprint with you on it there?

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Mon Jul  9 18:44:45 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 12:44:45 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -  apache config.
In-Reply-To: <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.co
 m>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>
	<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
	<18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
	<64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
Message-ID: <20070709164232.95EED3A404D@sparrow.telecommunity.com>

At 07:13 PM 7/9/2007 +0400, Ren? Dudfield wrote:
>The way to do this atomically, so not one can possibly get an old
>page, the static file will be removed as the change is committed.
>Then everyone gets the latest change right away - as soon as the
>change has been committed.

This sounds pretty good...  except that you may need better 
protection against a race condition.  What happens if a page is 
removed *while* it is being regenerated?  PostgreSQL has MVCC for 
read-only transactions, so the static page will be generated against 
old data, unless you have some other locking mechanism used to 
serialize access to the static file, that is shared by both the 
deletion and generating mechanisms.

One possible approach: if the generator writes its files to 
foo/index.html.tmp (opened with exclusive access) and then renames 
them to 'foo/index.html', then the deletion mechanism can attempt to 
*first* remove the .tmp file, then the real file.  Both processes 
must be robust against their renames or unlinks or exclusive open()'s 
failing, but there would then be no possibility of collision.  The 
exclusive open would have to be done at the *start* of write 
processing, however, before any database queries have been 
attempted.  (And their connection must be rolled back at that 
point.)  This ensures that, if a writer succeeds in locking the .tmp 
file, then they are seeing data that is current.

All that having been said, the idea in general sounds good.  If PyPI 
itself simply checked whether the URL it's about to serve is 
cacheable (i.e., has a static location and no user logged in), and if 
so, opened the temp file for exclusive writing, it could just dump 
its generated page out, and rename it at the end if it had been 
successful in acquiring the temp file.

And voila!  No separate caching process, no scheduling, and an always 
perfectly-up-to-date cache.  As soon as a page becomes out of date, 
it gets served dynamically...  but only for as long as it takes to 
serve one copy of that page.  :)

In pseudocode:

     def process_request():
         if no authentication header and URL path is cacheable:
             try:
                 temp = exclusive open cache file with .tmp extension
             except os.error:
                 pass
             else:
                 with stdout redirected to temp:
                     process_request_normally()
                 try:
                     rename(tempfilename, realfilename)
                 except os.error:
                     pass
                 send_browser_contents_of(temp)
                 return

         return process_request_normally()

Here, 'process_request_normally()' should refer to everything that 
PyPI does now, *including database connection rollback or 
commit*.  This will ensure that it's impossible to write stale data 
to the cache.

The deletion process should just do this:

     for name in (cache_path+'.tmp', cache_path):
         try:
             os.unlink(name)
         except os.error:
             pass

after committing the database transaction.


Informal serialization proof:

* Only one process may write to a page's .tmp file at a time

* Either the writer has committed its page write (by renaming the 
.tmp file), or it has not (i.e., rename() is atomic)

* If the writer has *not* committed its page, then the first unlink 
will prevent it from doing so.

* If the writer *has* committed its page, then the second unlink will 
undo this.

* If between the two unlinks operations, another writer appears, that 
writer will be reading current data from the database, because it has 
to acquire exclusive access to the .tmp file before doing a rollback 
and reading the data it will use for writing.

QED, it will be impossible to have stale data in the cache, unless 
the invalidating request fails to attempt its two unlink operations 
during the brief window after its database commit.


From renesd at gmail.com  Mon Jul  9 19:15:19 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Tue, 10 Jul 2007 03:15:19 +1000
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <20070709164232.95EED3A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>
	<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
	<18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
	<64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
	<20070709164232.95EED3A404D@sparrow.telecommunity.com>
Message-ID: <64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.com>

On 7/10/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 07:13 PM 7/9/2007 +0400, Ren? Dudfield wrote:
> >The way to do this atomically, so not one can possibly get an old
> >page, the static file will be removed as the change is committed.
> >Then everyone gets the latest change right away - as soon as the
> >change has been committed.
>
> This sounds pretty good...  except that you may need better
> protection against a race condition.  What happens if a page is
> removed *while* it is being regenerated?  PostgreSQL has MVCC for
> read-only transactions, so the static page will be generated against
> old data, unless you have some other locking mechanism used to
> serialize access to the static file, that is shared by both the
> deletion and generating mechanisms.
>


Hi,

move in linux/unix is atomic.  So the file is generated and then moved
in.  unlink is similar... once you remove it, any processes with that
file open still references the old file.

So no race condition.


def the static generation:
    - generate file in temp file
    - move temp file to place where static file lives.

def the update code:
    - do inserts/updates/deletes.
    - remove static files.
    - commit change.
    - the static generation()

From renesd at gmail.com  Mon Jul  9 19:31:10 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Tue, 10 Jul 2007 03:31:10 +1000
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <34EBAA43-50DD-49F0-BAB0-B114DA870C37@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>
	<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
	<18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
	<64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
	<34EBAA43-50DD-49F0-BAB0-B114DA870C37@zope.com>
Message-ID: <64ddb72c0707091031m1fe5fccai12708e38fb547d79@mail.gmail.com>

No, I haven't found anyone yet.  I'll write it up on the board, and
see if anyone wants to join in tomorrow - or maybe find someone at the
bar tonight.

Where do people report bugs for the cheeseshop/distutils?  Someone was
telling me today that he couldn't get the setup.py to do new releases
anymore.

cu.

On 7/10/07, Jim Fulton <jim at zope.com> wrote:
>
> On Jul 9, 2007, at 11:13 AM, Ren? Dudfield wrote:
>
> > Hello Jim,
> >
> > I double+ agree we should update on change.
>
> Yay! :)
>
> > On 7/9/07, Jim Fulton <jim at zope.com> wrote:
> >> Here's a common use case:
> >>
> >> - A user uploads a new release
> >>
> >> - They then use setuptools to install the release from PyPI.
> >> setuptools will not present their credentials and will therefore
> >> behave like a logged in user.  It will see and install an older
> >> version of the package.
> >>
> >
> > You mean it will behave like someone *not* logged in right?
>
> Right.
>
> >   Either
> > way they should always get the latest change.
>
> Yes, if we update the static on change.
>
> I though you were arguing that it didn't matter of cached pages were
> out of date because the person updating the pages would see the
> changes because they'd see uncached pages.
>
> > The way to do this atomically, so not one can possibly get an old
> > page, the static file will be removed as the change is committed.
> > Then everyone gets the latest change right away - as soon as the
> > change has been committed.
>
> Sure.
>
>
> >
> >> > Anyway... I'm just making the tool which can be used on demand,
> >> or at
> >> > regular timings.
> >>
> >> I wonder if we are talking about the same thing here.  I fear not.
> >> With event based update, you should only update the pages that need
> >> to be updated, at worst, this should be the pages for the project
> >> being updated plus http://www.python.org/pypi/.  The software needed
> >> for this would be very different than the software that would build
> >> the static pages initially or update all if a template has changed.
> >>
> >
> >
> > These are the commands so far:
> > python pypi-static-generation.py -create_single /pypi/pygame /tmp/
> > pygame.html
> > python pypi-static-generation.py -create_all
>
> Ah, so one script, 2 behaviors. Fair enough.
>
>
> > The generation of the main index page would be:
> > python pypi-static-generation.py -create_single /pypi/
> > path_to_static_indexpage.html
> >
> > Then there would be a command to update the single page:
> > python pypi-static-generation.py -create_single /pypi/Pygame
> > path_to_static_pygame.html
>
> Shouldn't that be implied by both of the commands above.
>
> I'm a little surprised that you are doing this as an external script,
> as opposed to adding the behavior to the cheeseshop code, but I guess
> it doesn't matter.
>
> > Ok, that's all for now.  I'll be able to finish it off in a few days
> > after europython.
>
> Haven't you been able to get anyone to sprint with you on it there?
>
> Jim
>
> --
> Jim Fulton                      mailto:jim at zope.com             Python Powered!
> CTO                             (540) 361-1714                  http://www.python.org
> Zope Corporation        http://www.zope.com             http://www.zope.org
>
>
>
>

From pje at telecommunity.com  Mon Jul  9 19:37:56 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 13:37:56 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -  apache config.
In-Reply-To: <64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.co
 m>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>
	<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>
	<18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
	<64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
	<20070709164232.95EED3A404D@sparrow.telecommunity.com>
	<64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.com>
Message-ID: <20070709173543.BC31B3A404D@sparrow.telecommunity.com>

At 03:15 AM 7/10/2007 +1000, Ren? Dudfield wrote:
>def the static generation:
>    - generate file in temp file
>    - move temp file to place where static file lives.
>
>def the update code:
>    - do inserts/updates/deletes.
>    - remove static files.
>    - commit change.
>    - the static generation()

Ah - I was assuming static generation was going to be a separate process.

However, there's still a race condition here, unless you open the 
temp file exclusively before the transaction commits.  If you wait 
until after the transaction is finished, another change could occur 
to the same page after you, but finish its page write *before* you, 
causing you to overwrite it with your move!  You then end up with an 
outdated page that will stick around indefinitely.  (Yes, it's 
unlikely, but it *can* happen, and therefore eventually will.)

So, as in my suggestion, you *still* need an exclusive open of a 
pre-determined tempfile name, prior to transaction commit.  Then, 
such an occurrence is impossible.

By the way, the generate-on-change approach also means you have to do 
a big batch run to pre-generate all the existing static pages; the 
approach I suggested will simply generate them in response to actual 
demand, with no batch processing necessary.  A new PyPI installation 
would just build up its cache as it gets used, getting faster as it goes.


From martin at v.loewis.de  Tue Jul 10 00:16:03 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 00:16:03 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
Message-ID: <4692B3A3.5030209@v.loewis.de>

> Lately, it's has often taken minutes.  This has been the major problem. 
> At the best of times. well, I don't know when those are. :)
> 
> ATM, requests for http://www.python.org/pypi/zc.buildout takes about 1/3
> second. 

Ok. By "ATM", you mean July 9, 14:09 GMT?

Please take a look at

http://ximinez.python.org/munin/localdomain/localhost.localdomain-load.html

That was the most significant spike in the load today, and I surely
would like to know what was causing it.

> Requests for
> http://cheeseshop.python.org/packages/2.5/z/zc.buildout/zc.buildout-1.0.0b28-py2.5.egg
> take about 2.5 seconds.

That is a static file, not going through PyPI. It's 168kiB, so that
means you download with 67kB/s.

> Requests for http://www.python.org/pypi/ take
> about 10 seconds.

Why does that matter for setuptools? Does setuptools ever look at this
page?

> I would say that these times are too long.

Which of these precisely? Given that the actual file downloads in 2.5s,
why is it important that the access to the page referring to it is 1/3s?

>> and how long should it
>> take so that you would consider it fast enough?
> 
> IMO, it needs to be much much faster.  If we were serving pages
> staticially, we would be able to serve thousands of requests per
> second.  There's nothing about this application that would make doing
> that hard.

I looked at the load preceding your message. Counting 1000 requests
backwards from 14:09, we are at 16:07. So this system receives roughly
1000 requests per minute in its peak load, and it seems to be able to
handle them (although the performance degrades at that point).

Of these requests, 853 came from a single machine (x.y.237.218), which
appears to be an extraordinarily "big" client of PyPI. 45 requests
came from msnbot, 13 from Google, 44 requests from setuptools (from
different machines), and the rest from various web browsers and
crawlers.

Also, there is a significant difference between throughput and latency:
1000 requests per second is a throughput requirement, whereas "faster
than 0.3s" is a latency requirement. They are somewhat unrelated, see
below.

>> It's difficult to implement a system if the requirements are
>> unknown to those implementing it.
> 
> I'm sorry, I've been talking about setuptools all along.  I thought the
> use case was understood.

I understand the use case, I just don't understand the performance
requirements resulting out of it. If it's an automated build, why do
you care if the page download completes in 0.3s or in 0.01s (it won't
be much faster because of network roundtrip times).

> Also, I thought it was pretty obvious that the
> performance we've been seeing lately is totally unacceptable.

Define "lately". I never personally saw "totally unacceptable
performance". Whenever I access the system, it behaves completely
reasonable, much faster than any other web pages.

There were only two instances of "totally unacceptable performance",
which were when the system was overloaded, and thrashing. I have
since fixed these cases; they cannot occur again. So I don't think
it is possible that the current installation shows "totally
unacceptable" performance.

> If this was an application that had to be served dynamically (and of
> course, parts of it are), then it would be much more interesting to
> discuss targets for dynamic delivery.  The performance-critical parts of
> this application -- the pages that setuptools uses, can readily be
> served statically, so it makes no sense not to do so.

Except that somebody needs to implement that, of course.

Regards,
Martin

From martin at v.loewis.de  Tue Jul 10 00:19:58 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 00:19:58 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>	<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>	<18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>
	<64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>
Message-ID: <4692B48E.705@v.loewis.de>

> These are the commands so far:
> python pypi-static-generation.py -create_single /pypi/pygame /tmp/pygame.html
> python pypi-static-generation.py -create_all

That also needs -create-single /pypi/pywin32/210

Regards,
Martin


From martin at v.loewis.de  Tue Jul 10 00:37:51 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 00:37:51 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<F4E8DC67-7ECB-4F0E-BC40-943926334742@zope.com>	<64ddb72c0707072014n6f3e9d7cre24b41ea09019f47@mail.gmail.com>	<18DACBA6-9ABA-4300-8DDF-EF066025D473@zope.com>	<64ddb72c0707090813m34eff743p5eb19b6b837ed817@mail.gmail.com>	<20070709164232.95EED3A404D@sparrow.telecommunity.com>
	<64ddb72c0707091015r7bf80d6bv67e8d1a2c1903fea@mail.gmail.com>
Message-ID: <4692B8BF.60203@v.loewis.de>

> So no race condition.

What Phillip says: "the update code" has a race condition,
if multiple simultaneous updates occur.

My proposal is still to put a table into Postgres that lists
the pages to regenerate. The (single) update process would
lock this job table, clear it, release the lock, and start
generating; alternatively, multiple update process would each
lock the table, generate, then release the lock.

Regards,
Martin

From pje at telecommunity.com  Tue Jul 10 02:34:26 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 09 Jul 2007 20:34:26 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -  apache config.
In-Reply-To: <4692B3A3.5030209@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
Message-ID: <20070710003214.A2EA83A404D@sparrow.telecommunity.com>

At 12:16 AM 7/10/2007 +0200, Martin v. L?wis wrote:
> > Requests for http://www.python.org/pypi/ take
> > about 10 seconds.
>
>Why does that matter for setuptools? Does setuptools ever look at this
>page?

Yes, in order to find the correct spelling for a package's name.  If 
a user types, say "pylons" when the package is listed on PyPI as 
"Pylons", setuptools looks at the root after the lookup of 
/pypi/pylons fails.  This need could be eliminated if PyPI would 
canonicalize package names case-insensitively, collapsing all 
non-alphanumeric characters (other than '.') to a single '-'.  i.e.:

def safe_name(name):
     """Convert an arbitrary string to a standard distribution name

     Any runs of non-alphanumeric/. characters are replaced with a single '-'.
     """
     return re.sub('[^A-Za-z0-9.]+', '-', name)

A case-insensitive match by safe_name would be ideal, and could also 
be used to prevent users from registering packages whose names differ 
only by case or punctuation.



From martin at v.loewis.de  Tue Jul 10 07:33:46 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 07:33:46 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -  apache config.
In-Reply-To: <20070710003214.A2EA83A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
Message-ID: <46931A3A.5000703@v.loewis.de>

> Yes, in order to find the correct spelling for a package's name.  If a
> user types, say "pylons" when the package is listed on PyPI as "Pylons",
> setuptools looks at the root after the lookup of /pypi/pylons fails. 

I don't understand. How does it help to look at /pypi in this case?
The right spelling of Pylons is not listed there, unless there was
a release of Pylons recently.

If you want to correct the spelling, you need to look at

http://cheeseshop.python.org/pypi?%3Aaction=index

> A case-insensitive match by safe_name would be ideal, and could also be
> used to prevent users from registering packages whose names differ only
> by case or punctuation.

Would it be acceptable to do an HTTP redirect in that case, ie.
redirect /pypi/pylons/0.9.5 to /pypi/Pylons/0.9.5? I would not
want to have multiple URLs to render the same page, in general
(I know it already does that in some cases).

I can see how lower-casing helps; I'm doubtful about replacing
spaces. I.e. why is it better to look for

python-ftp-server-library--pyftpdlib-

than

Python FTP server library (pyftpdlib)

IOW, if you have a mis-spelling of the latter, what are the
chances that it is so misspelled that the safe_name is still
the former? Shouldn't the package owner just correct the
package name, to pyftpdlib, and put the other string into
the summary?

In any case, if it where postgres 8.1 or later, I could simply do

select name from packages where
regexp_replace(lower(name),'[^a-z0-9.]','-')='gnosis-utilities';

to do the lookup; with 7.4, I would have to download all names
and do the safe matching myself.

Regards,
Martin

From martin at v.loewis.de  Tue Jul 10 08:07:15 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 08:07:15 +0200
Subject: [Catalog-sig] Speeding up /pypi
Message-ID: <46932213.6050508@v.loewis.de>

I created a partial index (didn't know such a thing existed until
yesterday) to speed up the computation of the home page:

CREATE INDEX journals_latest_releases ON
  journals(submitted_date, name, version)
  WHERE version IS NOT NULL AND action='new release';

and reworked the query to let postgres actually use that index;
now I can get the Cheeseshop home page as fast as that of
www.python.org (namely, in 0.1s), as measured by

start=time.time();x=urllib.urlopen("http://cheeseshop.python.org/pypi").read();print
time.time()-start

Regards,
Martin


From renesd at gmail.com  Tue Jul 10 10:49:10 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Tue, 10 Jul 2007 18:49:10 +1000
Subject: [Catalog-sig] Speeding up /pypi
In-Reply-To: <46932213.6050508@v.loewis.de>
References: <46932213.6050508@v.loewis.de>
Message-ID: <64ddb72c0707100149k46782c66m214b184447ab667b@mail.gmail.com>

nice one :)

On 7/10/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> I created a partial index (didn't know such a thing existed until
> yesterday) to speed up the computation of the home page:
>
> CREATE INDEX journals_latest_releases ON
>   journals(submitted_date, name, version)
>   WHERE version IS NOT NULL AND action='new release';
>
> and reworked the query to let postgres actually use that index;
> now I can get the Cheeseshop home page as fast as that of
> www.python.org (namely, in 0.1s), as measured by
>
> start=time.time();x=urllib.urlopen("http://cheeseshop.python.org/pypi").read();print
> time.time()-start
>
> Regards,
> Martin
>
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>

From jim at zope.com  Tue Jul 10 15:52:42 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 09:52:42 -0400
Subject: [Catalog-sig] Merge catalog and distutils sigs
Message-ID: <6C0A5EEC-7E01-4C25-BC09-E0B595C8109A@zope.com>


Is there are good reason for the distutils and catalog sigs to be  
separate?  Now, that PyPI is an integral part of the distribution  
system, I find most topics are really of of interested to both sigs,  
and I bet that the overlap between the sigs is significant.

Would anyone object to combining them?

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Tue Jul 10 16:15:05 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 10 Jul 2007 10:15:05 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -   apache config.
In-Reply-To: <46931A3A.5000703@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
Message-ID: <20070710141304.BC6903A40A4@sparrow.telecommunity.com>

At 07:33 AM 7/10/2007 +0200, Martin v. L?wis wrote:
> > Yes, in order to find the correct spelling for a package's name.  If a
> > user types, say "pylons" when the package is listed on PyPI as "Pylons",
> > setuptools looks at the root after the lookup of /pypi/pylons fails.
>
>I don't understand. How does it help to look at /pypi in this case?

It doesn't.  It looks at /pypi/ (note the trailing /) -- which lists 
all packages.


>The right spelling of Pylons is not listed there, unless there was
>a release of Pylons recently.
>
>If you want to correct the spelling, you need to look at
>
>http://cheeseshop.python.org/pypi?%3Aaction=index

Which is also spelled /pypi/ - the advantage of this is that a purely 
static index consisting of Apache directory indexes produces an 
equally useful result for setuptools.


> > A case-insensitive match by safe_name would be ideal, and could also be
> > used to prevent users from registering packages whose names differ only
> > by case or punctuation.
>
>Would it be acceptable to do an HTTP redirect in that case, ie.
>redirect /pypi/pylons/0.9.5 to /pypi/Pylons/0.9.5?

Yes, although setuptoools at the moment looks at /pypi/pylons/ 
(again, with a trailing /) and does not go to individual version 
pages unless the base page contains only links to individual version pages.

It will handle a redirect correctly, as far as interpreting relative 
links on result pages.


>  I would not
>want to have multiple URLs to render the same page, in general
>(I know it already does that in some cases).
>
>I can see how lower-casing helps; I'm doubtful about replacing
>spaces. I.e. why is it better to look for
>
>python-ftp-server-library--pyftpdlib-

That '--' would actually just be one '-'

>than
>
>Python FTP server library (pyftpdlib)

It's not much better, however, there are a lot of packages with 
shorter names for which it does help.  Mainly, though, setuptools 
just uses this for purposes of determining distribution filenames.


>IOW, if you have a mis-spelling of the latter, what are the
>chances that it is so misspelled that the safe_name is still
>the former? Shouldn't the package owner just correct the
>package name, to pyftpdlib, and put the other string into
>the summary?
>
>In any case, if it where postgres 8.1 or later, I could simply do
>
>select name from packages where
>regexp_replace(lower(name),'[^a-z0-9.]','-')='gnosis-utilities';
>
>to do the lookup; with 7.4, I would have to download all names
>and do the safe matching myself.

I think this will work instead:

    select name from packages where name ~* 'gnosis[^a-z0-9.]+utilities'

i.e., replace all '-' in the safe_name() with the appropriate 
regex.  '~*' is the case-insensitive regular expression match 
operator, according to:

    http://www.postgresql.org/docs/7.4/interactive/functions-matching.html

Of course, it may also suffice to do:

    select lower(name) from packages where name like 'gnosis_%utilities'

i.e. replace all '-' in the safe_name with '_%', which is sort of 
like '.+' in a regex.  You would still have to postprocess the result 
to catch the difference between say, "gnosis-utilities" and 
"gnosis3utilities" or some such, but there should be very few such matches.

The "like" query may be easier for postgres to use an index on - an 
expression index on lower(name) would do the trick.  Of course, I'm 
used to trying to optimize much larger databases than PyPI - with 
only a few thousand entries, a non-index query here may be just fine.

In any case, this query should also be used to check for uniqueness 
when adding packages.


From jim at zope.com  Tue Jul 10 16:32:10 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 10:32:10 -0400
Subject: [Catalog-sig] Why so many zc.buildout versions?
In-Reply-To: <46937F10.3070201@weitershausen.de>
References: <46937F10.3070201@weitershausen.de>
Message-ID: <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>

You raise a really good point, which is especially relevant in light  
of pypi performance issues and discussions.

I'm copying the distutils and catalog sigs to get some wider  
discussion. I apologize for the cross posting.

I'm beginning to wonder about the strategy that setuptools uses, or  
maybe about the way we are using the index.

It's important to note that there is nothing specific about the  
buildout package here.

It is very important to make multiple versions available to support  
requirements for specific package versions.  It make builds/installs  
repeatable, whether talking about buildout or other systems built on  
setuptools.  When someone has tested and wants to release an  
application built from a collection of distributions, they will want  
to specify those *specific* versions for future builds or installs.   
This means that we need to retain any versions published indefinitely  
in a way that can be found by setuptools.

Currently, the only way to support multiple versions with the  
cheeseshop is to unhide past releases.  This has a fairly severe  
effect on performance.  As the example below shows, setuptools will  
fetch the package page and then fetch the pages for each release.   
That's a lot of requests.  What makes it worse is that the individual  
package pages can be fairly long.  I've gotten in the habit of  
including full documentation on every release page.  For example,  
recent release pages for zc.buildout are around 200K. This is a  
fairly significant amount of data to transfer.  This will certainly  
make the scanning process take a long time for clients. (Obviously,  
if we keep doing things the way we are, I'll need to stop doing that.)

All of this aggravates any performance problems we might have.

Up to now, setuptools has tried hard to use existing systems without  
change. This means that it reuses systems designed primarily for  
people, not software. I think that setuptools rightly took the  
approach it has up to now so that progress could be made without  
making people change other systems.  This was appropriate when  
setuptools was evolving and people were figuring out ways to use it.   
I think it is time to take a step back and think a lot harder about  
how we'd want to structure an index to support setuptools.

IMO, a setuptools-aware index would have a single page for each package:

- The single page would be published in a case-insensitive way. It  
would be nice to find a way to avoid this, or maybe we should use a  
windows-based web server. :)  It would also be served very cheaply,  
for example statically.

- The single page would list links for all available distributions,  
which should include all distributions published.  It would also list  
any other URLs that should be scanned for releases, when releases  
aren't all uploaded to PyPI.

- The single page would contain very little additional information.  
It would be for use by software, not humans.

In addition, the root page with a trailing / would be empty and very  
cheap.

There are a lot of ways we could achieve this pretty cheaply while  
keeping the existing system pretty much as it is.

For example, the current effort to bake static pages could bake these  
pages instead.  We could make the new index available at a different  
URL for people to play with while we worked the kinks out of the  
process.

Of course, those of us who use the cheesehop and setuptools  
extensively can also achieve much of this by changing the way we work.

Thoughts?

Jim

On Jul 10, 2007, at 8:44 AM, Philipp von Weitershausen wrote:

> When easy_installing zc.buildout I realized that the CheeseShop  
> still lists a gazillion old versions of zc.buildout. That makes it  
> take quite some time to install zc.buildout (see below), and I  
> reckon the same sort of check has to happen each time it looks for  
> a new version of that egg...
>
> Is there any reason for having so many old versions around?
>
>
> $ easy_install zc.buildout
> Searching for zc.buildout
> Reading http://cheeseshop.python.org/pypi/zc.buildout/
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b19
> Reading http://svn.zope.org/zc.buildout
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b22
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b23
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b20
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b21
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b26
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b27
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b24
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b25
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b28
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b17
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b16
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b18
> Best match: zc.buildout 1.0.0b28
> ...

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Tue Jul 10 16:40:48 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 10:40:48 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <4692B3A3.5030209@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
Message-ID: <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>


On Jul 9, 2007, at 6:16 PM, Martin v. L?wis wrote:
...
> Ok. By "ATM", you mean July 9, 14:09 GMT?

Whenever I sent the note,

> Please take a look at
>
> http://ximinez.python.org/munin/localdomain/localhost.localdomain- 
> load.html
>
> That was the most significant spike in the load today, and I surely
> would like to know what was causing it.

Maybe someone was trying to mirror pypi because it is too slow. :/  I  
suspect that there is a lot of this going on.

>
>> Requests for
>> http://cheeseshop.python.org/packages/2.5/z/zc.buildout/ 
>> zc.buildout-1.0.0b28-py2.5.egg
>> take about 2.5 seconds.
>
> That is a static file, not going through PyPI. It's 168kiB, so that
> means you download with 67kB/s.

OK. So I guess that is reasonable.  I'll note that in the long term,  
we'll probably want to create mirrors to get better locality and this  
faster downloads and to prevent excessive bandwith consumption for  
python.org.

>
>> Requests for http://www.python.org/pypi/ take
>> about 10 seconds.
>
> Why does that matter for setuptools? Does setuptools ever look at this
> page?

Phillip answered this.


>> I would say that these times are too long.
>
> Which of these precisely? Given that the actual file downloads in  
> 2.5s,
> why is it important that the access to the page referring to it is  
> 1/3s?

I guess all of them except the download.  Really, in the long run, I  
think the download time is too long too.  But that isn't my immediate  
concern.

BTW, the problem is exacerbased by packages like zc.buildout that  
include full documentation in their package pages.  Although even  
packages that don't do that seem to take about a third of a second.

>>> and how long should it
>>> take so that you would consider it fast enough?
>>
>> IMO, it needs to be much much faster.  If we were serving pages
>> staticially, we would be able to serve thousands of requests per
>> second.  There's nothing about this application that would make doing
>> that hard.
>
> I looked at the load preceding your message. Counting 1000 requests
> backwards from 14:09, we are at 16:07. So this system receives roughly
> 1000 requests per minute in its peak load, and it seems to be able to
> handle them (although the performance degrades at that point).

You can expect one of 2 things to happen:

- We'll fix the PyPI performance problems and load will increase  
dramatically, or

- We won't fix the problems and people will create alternate  
indexes.  This is already happening.  If that happens, the load will  
likely still increase, although not as rapidly.

...

>>> It's difficult to implement a system if the requirements are
>>> unknown to those implementing it.
>>
>> I'm sorry, I've been talking about setuptools all along.  I  
>> thought the
>> use case was understood.
>
> I understand the use case, I just don't understand the performance
> requirements resulting out of it. If it's an automated build, why do
> you care if the page download completes in 0.3s or in 0.01s (it won't
> be much faster because of network roundtrip times).

Two reasons:

- People wait for these builds.  A build will usually make *many*  
(tens or hundreds) of requests for pypi checking for new versions of  
software.  If there  are no new versions, which will be the common  
case, then nothing will be downloaded.  I'm most interested in  
speeding up the checking.  Of course, a requests for http:// 
www.python.org/pypi/  will usually be done once per build if any of  
the packages in in the build aren't in pypi  (only once because  
setuptools caches pages internally).  It would be nice to find a way  
to stop doing this.

- If performance degrades, as it has often lately, then the times are  
much longer.  In fact, requests over the last few weeks have often  
timed out, making work grind to a halt.  It't imporant to realize  
that demand will increase substantially, so whatwver we do needs to  
be scalable.

>> Also, I thought it was pretty obvious that the
>> performance we've been seeing lately is totally unacceptable.
>
> Define "lately". I never personally saw "totally unacceptable
> performance". Whenever I access the system, it behaves completely
> reasonable, much faster than any other web pages.

I've seen requests take minutes and time out with proxy errors many  
times over the last few weeks.  We, ZC, and many people we work with  
are at the point of building private indexes to get around the  
horrible performance.


> There were only two instances of "totally unacceptable performance",
> which were when the system was overloaded, and thrashing. I have
> since fixed these cases; they cannot occur again. So I don't think
> it is possible that the current installation shows "totally
> unacceptable" performance.

Maybe others can chime in.

>> If this was an application that had to be served dynamically (and of
>> course, parts of it are), then it would be much more interesting to
>> discuss targets for dynamic delivery.  The performance-critical  
>> parts of
>> this application -- the pages that setuptools uses, can readily be
>> served statically, so it makes no sense not to do so.
>
> Except that somebody needs to implement that, of course.

And happily, someone is.

I've realized this morning, in responding to a note from Philipp von  
Weitershausen that we really should take a step back and think about  
an index to support setuptools, or, failing that, rethink the ways  
we're using PyPI in light of the way setuptools works.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Tue Jul 10 17:56:42 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 10 Jul 2007 11:56:42 -0400
Subject: [Catalog-sig] [Distutils] Why so many zc.buildout versions?
In-Reply-To: <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
References: <46937F10.3070201@weitershausen.de>
	<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
Message-ID: <20070710155535.4D8CC3A40A9@sparrow.telecommunity.com>

At 10:32 AM 7/10/2007 -0400, Jim Fulton wrote:
>Currently, the only way to support multiple versions with the
>cheeseshop is to unhide past releases.  This has a fairly severe
>effect on performance.  As the example below shows, setuptools will
>fetch the package page and then fetch the pages for each release.
>That's a lot of requests.

This could potentially be fixed in setuptools, so that it only looks 
at release pages that match its requirements, in highest-to-lowest 
version order, stopping as soon as a suitable match is found.  That 
would eliminate the current issue -- but only for new versions of 
setuptools.  So I do like your idea better, since it can be made to 
work for already-deployed clients as well.


>I think it is time to take a step back and think a lot harder about
>how we'd want to structure an index to support setuptools.

+1, as long as somebody's willing to build and host the 
thing.  Please see my earlier comments on the Catalog-Sig about this.


>IMO, a setuptools-aware index would have a single page for each package:
>
>- The single page would be published in a case-insensitive way. It
>would be nice to find a way to avoid this, or maybe we should use a
>windows-based web server. :)  It would also be served very cheaply,
>for example statically.

Apache's CheckSpelling directive does case-insensitivity and 
approximate matching.  Combine that with making the directories be 
based on "safe_name" values to begin with, and you should be all set.


>- The single page would list links for all available distributions,
>which should include all distributions published.  It would also list
>any other URLs that should be scanned for releases, when releases
>aren't all uploaded to PyPI.

The piece you're missing here is direct links to other downloads, 
such as "#egg=project-dev" subversion links.  However, if you 
extracted these from all of the relevant PyPI HTML pages, you could 
certainly do that.


>In addition, the root page with a trailing / would be empty and very
>cheap.

As long as the individual package directories are safe_name based, 
this would work.


>There are a lot of ways we could achieve this pretty cheaply while
>keeping the existing system pretty much as it is.

Of course, there are still other reasons to want to improve the 
Cheeseshop's performance, such as search engines and other bots.


>For example, the current effort to bake static pages could bake these
>pages instead.  We could make the new index available at a different
>URL for people to play with while we worked the kinks out of the
>process.

...and then use a User-Agent rewrite rule to redirect setuptools 
clients to the static piece, as soon as we're satisfied that it works.


From jim at zope.com  Tue Jul 10 18:04:01 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 12:04:01 -0400
Subject: [Catalog-sig] [Distutils] Why so many zc.buildout versions?
In-Reply-To: <20070710155535.4D8CC3A40A9@sparrow.telecommunity.com>
References: <46937F10.3070201@weitershausen.de>
	<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
	<20070710155535.4D8CC3A40A9@sparrow.telecommunity.com>
Message-ID: <E25E7A55-D64F-46F6-9781-B87F0FE591CE@zope.com>


On Jul 10, 2007, at 11:56 AM, Phillip J. Eby wrote:

> At 10:32 AM 7/10/2007 -0400, Jim Fulton wrote:
>> Currently, the only way to support multiple versions with the
>> cheeseshop is to unhide past releases.  This has a fairly severe
>> effect on performance.  As the example below shows, setuptools will
>> fetch the package page and then fetch the pages for each release.
>> That's a lot of requests.
>
> This could potentially be fixed in setuptools, so that it only  
> looks at release pages that match its requirements, in highest-to- 
> lowest version order, stopping as soon as a suitable match is  
> found.  That would eliminate the current issue

No, it will mitigate the current issue somewhat, but it will still  
involve multiple requests per package, while a simpler index  
structure would allow a single request per package.

> -- but only for new versions of setuptools.  So I do like your idea  
> better, since it can be made to work for already-deployed clients  
> as well.

Yup.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Tue Jul 10 23:29:14 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 23:29:14 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -   apache config.
In-Reply-To: <20070710141304.BC6903A40A4@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
Message-ID: <4693FA2A.3020107@v.loewis.de>

> It doesn't.  It looks at /pypi/ (note the trailing /) -- which lists all
> packages.

Ah, ok. I keep forgetting that feature.

>> Would it be acceptable to do an HTTP redirect in that case, ie.
>> redirect /pypi/pylons/0.9.5 to /pypi/Pylons/0.9.5?
> 
> Yes, although setuptoools at the moment looks at /pypi/pylons/ (again,
> with a trailing /) and does not go to individual version pages unless
> the base page contains only links to individual version pages.

Right - I meant that to mean that it would redirect /pypi/Pylons/ to
/pypi/pylons/

> I think this will work instead:
> 
>    select name from packages where name ~* 'gnosis[^a-z0-9.]+utilities'

Ok. I was hoping to be able to create an index of safe_names, which
postgres would automatically maintain on updates; the above approach
would always cause a sequential scan (in postgres, not in Python).

Your second approach (using like) might solve that, but there I
dislike having the logic both in Python and in SQL - ideally, only
one of them should do "real" computation (and ideally, it would be
SQL).

On ximinez, your query gets analyzed as

 Seq Scan on packages  (cost=0.00..46.65 rows=1 width=13) (actual
time=0.461..9.367 rows=1 loops=1)
   Filter: (name ~* 'gnosis[^a-z0-9.]+utilities'::text)
 Total runtime: 9.413 ms

Compared to some other queries it performs, that's a cheap one.

> In any case, this query should also be used to check for uniqueness when
> adding packages.

Hmm. I'm somewhat skeptical about setuptools (or any other packaging
infrastructure, say, Debian) establishing rules on what makes a
difference in package names.

Regards,
Martin

From martin at v.loewis.de  Tue Jul 10 23:36:28 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 23:36:28 +0200
Subject: [Catalog-sig] Why so many zc.buildout versions?
In-Reply-To: <73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
References: <46937F10.3070201@weitershausen.de>
	<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
Message-ID: <4693FBDC.2060201@v.loewis.de>

> For example, the current effort to bake static pages could bake these  
> pages instead.

Certainly not instead; in addition, if there are volunteers to implement
that.

> We could make the new index available at a different  
> URL for people to play with while we worked the kinks out of the  
> process.

I have been thinking about the same thing. I think it would be good
to have, however, it will surely take some time until all setuptools
implementations learn to use it.

> Of course, those of us who use the cheesehop and setuptools  
> extensively can also achieve much of this by changing the way we work.

Hmm. How about those using them extensively start contributing to
them also?

Regards,
Martin


From martin at v.loewis.de  Tue Jul 10 23:39:28 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 23:39:28 +0200
Subject: [Catalog-sig] [Distutils] Why so many zc.buildout versions?
In-Reply-To: <E25E7A55-D64F-46F6-9781-B87F0FE591CE@zope.com>
References: <46937F10.3070201@weitershausen.de>	<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>	<20070710155535.4D8CC3A40A9@sparrow.telecommunity.com>
	<E25E7A55-D64F-46F6-9781-B87F0FE591CE@zope.com>
Message-ID: <4693FC90.9060001@v.loewis.de>

> No, it will mitigate the current issue somewhat, but it will still  
> involve multiple requests per package, while a simpler index  
> structure would allow a single request per package.

I don't understand. If setuptools would always look
/pypi/package/version first, it would immediately find the right
page if that version is indeed stored in the cheeseshop.

Why would that require multiple requests per package?

Regards,
Martin


From martin at v.loewis.de  Tue Jul 10 23:48:04 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 10 Jul 2007 23:48:04 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
Message-ID: <4693FE94.6090107@v.loewis.de>

>> That was the most significant spike in the load today, and I surely
>> would like to know what was causing it.
> 
> Maybe someone was trying to mirror pypi because it is too slow. :/  I
> suspect that there is a lot of this going on.

In that case, I doubt it. The top client identified itself as
setuptools.

> I've seen requests take minutes and time out with proxy errors many
> times over the last few weeks.  We, ZC, and many people we work with are
> at the point of building private indexes to get around the horrible
> performance.

I still don't understand why you consider this an easier option than
contributing to the existing project. If you invest time to do an
alternative, isn't this more expensive than starting where others
have already contributed?

But if you think that scratches your itches: good luck!

> Maybe others can chime in.

That's also my concern. Nobody else is complaining; AFAICT, there
is just one unhappy user of PyPI.

Regards,
Martin


From jim at zope.com  Tue Jul 10 23:49:43 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 17:49:43 -0400
Subject: [Catalog-sig] Why so many zc.buildout versions?
In-Reply-To: <4693FBDC.2060201@v.loewis.de>
References: <46937F10.3070201@weitershausen.de>
	<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
	<4693FBDC.2060201@v.loewis.de>
Message-ID: <4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com>


On Jul 10, 2007, at 5:36 PM, Martin v. L?wis wrote:

>> For example, the current effort to bake static pages could bake these
>> pages instead.
>
> Certainly not instead; in addition, if there are volunteers to  
> implement
> that.

Sure,

>
>> We could make the new index available at a different
>> URL for people to play with while we worked the kinks out of the
>> process.
>
> I have been thinking about the same thing. I think it would be good
> to have, however, it will surely take some time until all setuptools
> implementations learn to use it.

No, not at all.  You can tell setuptools to use a different index  
than the current one.  For example, this is a command-line option for  
easy_install and a configuration option for buildout.

>> Of course, those of us who use the cheesehop and setuptools
>> extensively can also achieve much of this by changing the way we  
>> work.
>
> Hmm. How about those using them extensively start contributing to
> them also?

I like to think that I am by participating in this discussion.   
Actually changing the cheeseshop software has a very high learning  
curve. I don't think that I can make that kind of time any time  
soon.  I'm very grateful that you and Ren? are doing what you're  
doing.  I also suspect that, given your and Ren?'s activity, it would  
be counter productive for someone else to get involved at that level,  
but maybe I'm wrong about that.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Tue Jul 10 23:55:28 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 17:55:28 -0400
Subject: [Catalog-sig] [Distutils] Why so many zc.buildout versions?
In-Reply-To: <4693FC90.9060001@v.loewis.de>
References: <46937F10.3070201@weitershausen.de>	<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>	<20070710155535.4D8CC3A40A9@sparrow.telecommunity.com>
	<E25E7A55-D64F-46F6-9781-B87F0FE591CE@zope.com>
	<4693FC90.9060001@v.loewis.de>
Message-ID: <B697ECFA-56BE-4733-A5D4-24FD423DC26B@zope.com>


On Jul 10, 2007, at 5:39 PM, Martin v. L?wis wrote:

>> No, it will mitigate the current issue somewhat, but it will still
>> involve multiple requests per package, while a simpler index
>> structure would allow a single request per package.
>
> I don't understand. If setuptools would always look
> /pypi/package/version first, it would immediately find the right
> page if that version is indeed stored in the cheeseshop.
>
> Why would that require multiple requests per package?

It usually doesn't have a single required version.  It usually has  
just a package name or a name and a range of versions.  It has to  
scan the package page to find out what versions are available, and  
*then* it can load the release page for the highest version that  
satisfies the requirement.  It can usually read that one page,  
however, there may be additional filtering needed that would cause it  
to search multiple releases.  For example, it might be looking for a  
source distribution, or a platform-specific distribution that isn't  
available for the most recent release.  In any case, the best case is  
that it has to scan the package page to find the most recent release,  
and then scan that release page.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Wed Jul 11 00:18:00 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 10 Jul 2007 18:18:00 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <4693FA2A.3020107@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
Message-ID: <20070710221547.4A3043A40A4@sparrow.telecommunity.com>

At 11:29 PM 7/10/2007 +0200, Martin v. L?wis wrote:
>Hmm. I'm somewhat skeptical about setuptools (or any other packaging
>infrastructure, say, Debian) establishing rules on what makes a
>difference in package names.

I can certainly understand that.  However, *having* SOME definition 
that's more human-friendly (and cross-platform filename friendly!) 
than "the bytes match exactly", would be very useful to have.

If PyPI had already had one (and I asked about this when I was first 
trying to establish one) I'd have used that, or negotiated a 
compromise if it didn't meet the filename-related requirements.

However, none of the times that I asked about this issue on either 
the catalog-sig nor the distutils-sig did anyone propose any 
alternative canonicalization, nor bring up any objection besides the 
general sort of reservation that you're expressing here - i.e., not 
sure it's a good idea, but not expressing any particular reason it's 
a bad idea.

Note that Windows (and Mac OS under certain circumstances) have 
filename case insensitivity, and have different restrictions about 
what can or can't be in a filename than Unix.  Spaces and other 
punctuation characters can cause problems for shells, even if they're 
theoretically acceptable as filenames.

If you'd like to propose a *different* canonicalization, however, I'm 
certainly willing to consider implementing it in setuptools, if it 
can be done.  However, as I said, nobody has proposed anything else, 
but it would be nice to resolve the issue *before* name collisions happen.

If anything, I think that PyPI canonicalization may wish to be *more* 
restrictive than setuptools' is.  There isn't a whole lot of user 
benefit to having, say, "Mike's Nifty module" and "Mikes Nifty 
Module" being considered distinct packages, even though setuptools 
actually allows that distinction to be made.

IOW, setuptools' focus is more on distribution filename safety, 
rather than on sensible naming distinctions for end users.  The 
former is less restrictive than the latter, I believe.


From jim at zope.com  Wed Jul 11 00:54:10 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 18:54:10 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <4693FE94.6090107@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de>
Message-ID: <CFA0A89E-0885-4084-8BD9-92868EF21B63@zope.com>


On Jul 10, 2007, at 5:48 PM, Martin v. L?wis wrote:

>> I've seen requests take minutes and time out with proxy errors many
>> times over the last few weeks.  We, ZC, and many people we work  
>> with are
>> at the point of building private indexes to get around the horrible
>> performance.
>
> I still don't understand why you consider this an easier option than
> contributing to the existing project.

I don't. I'm not advocating it.  In fact, I've been trying to  
convince people not to.

People are doing it, usually in limited ways, out of desperation.

...

>> Maybe others can chime in.
>
> That's also my concern. Nobody else is complaining; AFAICT, there
> is just one unhappy user of PyPI.

Oh come on, I'm not the only one who has posted messages on this  
mailing list over the last few weeks reporting problems.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Wed Jul 11 00:55:57 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 18:55:57 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -   apache config.
In-Reply-To: <4693FA2A.3020107@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
Message-ID: <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>


On Jul 10, 2007, at 5:29 PM, Martin v. L?wis wrote:
...
> Hmm. I'm somewhat skeptical about setuptools (or any other packaging
> infrastructure, say, Debian) establishing rules on what makes a
> difference in package names.

Why?  It certainly seems reasonable to me for a packaging system to  
define rules for package names.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Wed Jul 11 01:03:09 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 10 Jul 2007 19:03:09 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -   apache config.
In-Reply-To: <20070710221547.4A3043A40A4@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
Message-ID: <DDB69434-3EDA-4A4B-99BE-E5525AF79E1E@zope.com>


On Jul 10, 2007, at 6:18 PM, Phillip J. Eby wrote:
> At 11:29 PM 7/10/2007 +0200, Martin v. L?wis wrote:
>> Hmm. I'm somewhat skeptical about setuptools (or any other packaging
>> infrastructure, say, Debian) establishing rules on what makes a
>> difference in package names.
>
> I can certainly understand that.  However, *having* SOME definition  
> that's more human-friendly (and cross-platform filename friendly!)  
> than "the bytes match exactly", would be very useful to have.
>
> If PyPI had already had one (and I asked about this when I was  
> first trying to establish one) I'd have used that, or negotiated a  
> compromise if it didn't meet the filename-related requirements.
>
> However, none of the times that I asked about this issue on either  
> the catalog-sig nor the distutils-sig did anyone propose any  
> alternative canonicalization, nor bring up any objection besides  
> the general sort of reservation that you're expressing here - i.e.,  
> not sure it's a good idea, but not expressing any particular reason  
> it's a bad idea.

I think it is time we (the Python community) nailed this down.   
Perhaps a distribution project-name naming PEP is in order.

>
> Note that Windows (and Mac OS under certain circumstances) have  
> filename case insensitivity, and have different restrictions about  
> what can or can't be in a filename than Unix.  Spaces and other  
> punctuation characters can cause problems for shells, even if  
> they're theoretically acceptable as filenames.

Why should this imply case insensitivity of distribution project  
names.  Python has case sensitive module (including package) names  
that can lead to problems if two modules have names that differ only  
in case.  (I assume that Python 3000 retains this although, sadly, I  
don't know.)  We deal with this by telling people "don't do that."   
Two packages with the same name except for case are incompatible, but  
then, so are modules with incompatible dependencies.

> If you'd like to propose a *different* canonicalization, however,  
> I'm certainly willing to consider implementing it in setuptools, if  
> it can be done.  However, as I said, nobody has proposed anything  
> else, but it would be nice to resolve the issue *before* name  
> collisions happen.
>
> If anything, I think that PyPI canonicalization may wish to be  
> *more* restrictive than setuptools' is.  There isn't a whole lot of  
> user benefit to having, say, "Mike's Nifty module" and "Mikes Nifty  
> Module" being considered distinct packages, even though setuptools  
> actually allows that distinction to be made.
>
> IOW, setuptools' focus is more on distribution filename safety,  
> rather than on sensible naming distinctions for end users.  The  
> former is less restrictive than the latter, I believe.

I don't care much what canonicalization we use, but I agree strongly  
that we should decide something.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Wed Jul 11 02:12:54 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 10 Jul 2007 20:12:54 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <DDB69434-3EDA-4A4B-99BE-E5525AF79E1E@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
	<DDB69434-3EDA-4A4B-99BE-E5525AF79E1E@zope.com>
Message-ID: <20070711001040.D6DBF3A404D@sparrow.telecommunity.com>

At 07:03 PM 7/10/2007 -0400, Jim Fulton wrote:
>Why should this imply case insensitivity of distribution project
>names.  Python has case sensitive module (including package) names
>that can lead to problems if two modules have names that differ only
>in case.

Module names are identifiers, with an already-restricted character 
set.  Package names are strings, and many people (especially those 
who enter their PyPI data through the web) assume they can put 
whatever the heck they want in there.


>   (I assume that Python 3000 retains this although, sadly, I
>don't know.)  We deal with this by telling people "don't do that."

Right...  and PyPI's input validation would be a good place to tell them.  :)


>Two packages with the same name except for case are incompatible, but
>then, so are modules with incompatible dependencies.

Compatibility isn't the only concern, it's also about confusion as to 
which package is which.  While one can't legislate away confusion, 
fixing simple, obvious errors that can and *do* occur in practice 
(like one package name having one space in it, the other having two!) 
is a good idea.

One of the things that prompted my search for a canonicalization 
strategy was my survey of existing CheeseShop packages, which 
actually included a certain amount of duplication due to changes in 
case or punctuation at one point.  (I believe the specific instances 
were fixed a long time ago, although I wouldn't rule out the 
possibility that some still exist.)



From srichter at cosmos.phy.tufts.edu  Wed Jul 11 02:16:40 2007
From: srichter at cosmos.phy.tufts.edu (Stephan Richter)
Date: Tue, 10 Jul 2007 20:16:40 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
Message-ID: <200707102016.40669.srichter@cosmos.phy.tufts.edu>

Hi all,

Jim Fulton forwarded this exchange to the Zope3-Dev mailing lsit asking for us 
to comment.

> On Jul 10, 2007, at 5:48 PM, Martin v. L?wis wrote:
>> That's also my concern. Nobody else is complaining; AFAICT, there
>> is just one unhappy user of PyPI.
>
> Oh come on, I'm not the only one who has posted messages on this
> mailing list over the last few weeks reporting problems.

I can assure you that I have had several times troubles with performance. One 
Friday I could not even finish my release, because I could not upload to PyPI 
or test the release since the packages were not downloaded after 5 hours!

Regards,
Stephan
-- 
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training

From waterbug at pangalactic.us  Wed Jul 11 04:55:30 2007
From: waterbug at pangalactic.us (Stephen Waterbury)
Date: Tue, 10 Jul 2007 22:55:30 -0400
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <4693FE94.6090107@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de>
Message-ID: <469446A2.9070500@pangalactic.us>

Martin v. L?wis wrote:
> [Jim Fulton wrote:]
>> Maybe others can chime in.
> 
> That's also my concern. Nobody else is complaining; AFAICT, there
> is just one unhappy user of PyPI.

I'm not happy with PyPI's performance either.
Probably many users are like me:  I thought it was
common knowledge that the performance of PyPI was bad, but
I didn't want to complain when it appeared that people were
working on improvements.

Steve

From richardjones at optusnet.com.au  Wed Jul 11 06:04:24 2007
From: richardjones at optusnet.com.au (richardjones at optusnet.com.au)
Date: Wed, 11 Jul 2007 14:04:24 +1000
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
Message-ID: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>

An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: http://mail.python.org/pipermail/catalog-sig/attachments/20070711/c7f5e06a/attachment.pot 

From martin at v.loewis.de  Wed Jul 11 07:16:26 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 07:16:26 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <20070710221547.4A3043A40A4@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
Message-ID: <469467AA.7070409@v.loewis.de>

> Note that Windows (and Mac OS under certain circumstances) have filename
> case insensitivity, and have different restrictions about what can or
> can't be in a filename than Unix.  Spaces and other punctuation
> characters can cause problems for shells, even if they're theoretically
> acceptable as filenames.

I can see that collisions should be avoided in advance when it comes to
file names. However, the name of a software package is not necessarily a
file name, nor is it even related to the name of files inside the
package.

*Python* package names are the ones that must not conflict. For
a packaging tool, the names of the package files must not conflict,
either. For the package names in general, issues of file names
are only remotely relevant, on a first glance.

> IOW, setuptools' focus is more on distribution filename safety, rather
> than on sensible naming distinctions for end users.  The former is less
> restrictive than the latter, I believe.

Yes. However, it's not clear to me that the infrastructure needs to
(or even is able to) enforce sensible naming. Instead, any policing
that might be necessary should be done in the community. If two
packages are named too similarly, users will get confused, and
eventually one package may disappear, get renamed, get its naming
challenged in court, and so on. It's not the job of the package
*index* to do that sort of policing.

Regards,
Martin


From martin at v.loewis.de  Wed Jul 11 07:19:45 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 07:19:45 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <CFA0A89E-0885-4084-8BD9-92868EF21B63@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de>
	<CFA0A89E-0885-4084-8BD9-92868EF21B63@zope.com>
Message-ID: <46946871.3060100@v.loewis.de>

> People are doing it, usually in limited ways, out of desperation.

Same question to these people, then (whoever they are): why
do you think it's easier to build your own index in desperation,
rather than contributing to PyPI?

>> That's also my concern. Nobody else is complaining; AFAICT, there
>> is just one unhappy user of PyPI.
> 
> Oh come on, I'm not the only one who has posted messages on this mailing
> list over the last few weeks reporting problems.

Can you kindly refer to four or five such messages in the archives?
I must have missed them.

Regards,
Martin

From waterbug at pangalactic.us  Wed Jul 11 07:20:05 2007
From: waterbug at pangalactic.us (Stephen Waterbury)
Date: Wed, 11 Jul 2007 01:20:05 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
Message-ID: <46946885.8080100@pangalactic.us>

richardjones at optusnet.com.au wrote:
> Stephen Waterbury <waterbug at pangalactic.us> wrote:
>> Martin v. L??wis wrote:
>>> [Jim Fulton wrote:]
>>>> Maybe others can chime in.
>>> That's also my concern. Nobody else is complaining; AFAICT, there
>>> is just one unhappy user of PyPI.
>> I'm not happy with PyPI's performance either.
>> Probably many users are like me:  I thought it was
>> common knowledge that the performance of PyPI was bad, but
>> I didn't want to complain when it appeared that people were
>> working on improvements.
> 
> It has been slow in the past, but Martin has done some great work
> speeding it up in the last few days. If it's still slow, please
> report when you noticed and what you were trying to do.

I agree, Martin's improvements have made a huge difference, in my
recent experience.  Thanks, Martin!  I inferred from the conversation
that the performance is variable, and I think my tests of it have been
in off-peak times, so my current impressions should be regarded as
anecdotal ... another reason why I hadn't volunteered my opinion until
this request for input.

Steve

From martin at v.loewis.de  Wed Jul 11 07:21:09 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 07:21:09 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -   apache config.
In-Reply-To: <069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>
	<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
Message-ID: <469468C5.8000906@v.loewis.de>

>> Hmm. I'm somewhat skeptical about setuptools (or any other packaging
>> infrastructure, say, Debian) establishing rules on what makes a
>> difference in package names.
> 
> Why?  It certainly seems reasonable to me for a packaging system to  
> define rules for package names.

Ah, sure. It's certainly fine and reasonable for a packaging system
to do that for its own purposes. However, I'm skeptical about that
packaging system then to enforce its rules on other systems (such
as the cheeseshop, which is not packaging system).

Regards,
Martin

From martin at v.loewis.de  Wed Jul 11 07:28:09 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 07:28:09 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <469446A2.9070500@pangalactic.us>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>	<4693FE94.6090107@v.loewis.de>
	<469446A2.9070500@pangalactic.us>
Message-ID: <46946A69.4000702@v.loewis.de>

> I'm not happy with PyPI's performance either.
> Probably many users are like me:  I thought it was
> common knowledge that the performance of PyPI was bad

Please trust me that it isn't. I know that PyPI could
become unresponsive, and I FIXED that. AFAICT, it's
solved, done, can't happen again. I do not know that
performance IS bad; I know that it WAS bad (primarily
not due to the way the software was written, but
due to the way it was run).

> but
> I didn't want to complain when it appeared that people were
> working on improvements.

Sure: mere complaints would not be constructive. However,
specific *reports* of problems are absolutely necessary.
If you experience problems today, tomorrow, next week,
by all means, report them. Different people apparently
also have different perception what good performance is,
so please always make a full bug report:

- what precisely did you do (including "when" also
  in this case),
- what happened,
- what did you expect to happen instead

Regards,
Martin

From martin at v.loewis.de  Wed Jul 11 07:31:06 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 07:31:06 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <46946885.8080100@pangalactic.us>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<46946885.8080100@pangalactic.us>
Message-ID: <46946B1A.9040004@v.loewis.de>

> I agree, Martin's improvements have made a huge difference, in my
> recent experience.  Thanks, Martin!  I inferred from the conversation
> that the performance is variable, and I think my tests of it have been
> in off-peak times, so my current impressions should be regarded as
> anecdotal ... another reason why I hadn't volunteered my opinion until
> this request for input.

Ah, ok. Please take a look at

http://ximinez.python.org/munin/localdomain/localhost.localdomain-load.html

Times are in CEST (UTC+2), so the peak load occurred during the times
I was asleep - I never personally see any significant load on the system
anymore. If you also work in a similar time zone as I do, I would
consider your problems solved.

Regards,
Martin

From gentoodev at gmail.com  Wed Jul 11 07:54:48 2007
From: gentoodev at gmail.com (Rob Cakebread)
Date: Tue, 10 Jul 2007 22:54:48 -0700
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <469467AA.7070409@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
	<469467AA.7070409@v.loewis.de>
Message-ID: <9b06ffb10707102254i57e5c0f8gf92836805f8a0626@mail.gmail.com>

On 7/10/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
>
> Yes. However, it's not clear to me that the infrastructure needs to
> (or even is able to) enforce sensible naming. Instead, any policing
> that might be necessary should be done in the community. If two
> packages are named too similarly, users will get confused, and
> eventually one package may disappear, get renamed, get its naming
> challenged in court, and so on. It's not the job of the package
> *index* to do that sort of policing.
>

Every package index I can think of does enforce sensible naming, except PyPI.

Nobody is going to change the name of their project if you enforce
sensible naming for PyPI, they'll just have to map their project name
to a way that is easily mapped to PyPI's system, just like on
Freshmeat, RubyForge etc.

From martin at v.loewis.de  Wed Jul 11 07:06:59 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 07:06:59 +0200
Subject: [Catalog-sig] Why so many zc.buildout versions?
In-Reply-To: <4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com>
References: <46937F10.3070201@weitershausen.de>
	<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
	<4693FBDC.2060201@v.loewis.de>
	<4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com>
Message-ID: <46946573.2070400@v.loewis.de>

>> I have been thinking about the same thing. I think it would be good
>> to have, however, it will surely take some time until all setuptools
>> implementations learn to use it.
> 
> No, not at all.  You can tell setuptools to use a different index than
> the current one.  For example, this is a command-line option for
> easy_install and a configuration option for buildout.

Yes. However, that will make the feature only available to those who
know about it. I have very shallow knowledge of setuptools and
easy_install only (I nearly never use them at all), and I surely would
miss such an option, and miss why it's relevant.

It's true that the Apache installation could also redirect existing
installations to the new pages, but I doubt that they would be otherwise
widely used until setuptools changes its hard-coded default.

>> Hmm. How about those using them extensively start contributing to
>> them also?
> 
> I like to think that I am by participating in this discussion.  Actually
> changing the cheeseshop software has a very high learning curve. I don't
> think that I can make that kind of time any time soon.  I'm very
> grateful that you and Ren? are doing what you're doing.  I also suspect
> that, given your and Ren?'s activity, it would be counter productive for
> someone else to get involved at that level, but maybe I'm wrong about that.

I strongly think you are. There are many things that could be improved,
and I would not mind leaving the cheeseshop alone if some other
maintainer came along - I also have other things to do.

Regards,
Martin



From martin at v.loewis.de  Wed Jul 11 07:44:53 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 11 Jul 2007 07:44:53 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <200707102016.40669.srichter@cosmos.phy.tufts.edu>
References: <200707102016.40669.srichter@cosmos.phy.tufts.edu>
Message-ID: <46946E55.30308@v.loewis.de>

>>> That's also my concern. Nobody else is complaining; AFAICT, there
>>> is just one unhappy user of PyPI.
>> Oh come on, I'm not the only one who has posted messages on this
>> mailing list over the last few weeks reporting problems.
> 
> I can assure you that I have had several times troubles with performance. One 
> Friday I could not even finish my release, because I could not upload to PyPI 
> or test the release since the packages were not downloaded after 5 hours!

I assume you are talking about past here - I can readily believe that
has happened. I think it's fixed now, and it should not happen again
that you have to wait 5 hours to download a file (unless there is
some hardware failure, network outage or the like beyond the control
of the local software).

So yes, I trust that there have been complaints in the past - I
wonder whether there are *still* complaints (beyond the ones
of Jim Fulton).

Regards,
Martin

From benji at benjiyork.com  Wed Jul 11 14:10:30 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 08:10:30 -0400
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <46946871.3060100@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>	<4693FE94.6090107@v.loewis.de>	<CFA0A89E-0885-4084-8BD9-92868EF21B63@zope.com>
	<46946871.3060100@v.loewis.de>
Message-ID: <4694C8B6.1030804@benjiyork.com>

Martin v. L?wis wrote:
>> People are doing it, usually in limited ways, out of desperation.
> 
> Same question to these people, then (whoever they are): why
> do you think it's easier to build your own index in desperation,
> rather than contributing to PyPI?

Because they aren't aware of the progress being made or the intent to 
make more?

>>> That's also my concern. Nobody else is complaining; AFAICT, there
>>> is just one unhappy user of PyPI.
>> Oh come on, I'm not the only one who has posted messages on this mailing
>> list over the last few weeks reporting problems.
> 
> Can you kindly refer to four or five such messages in the archives?
> I must have missed them.

Here's one (you didn't say they had to be past messages <wink>).

Is your position that PyPI isn't down/very slow on occasion or that when 
it is no one complains?

My team has lost many man hours to PyPI begin down/glacially slow.  This 
isn't meant to disparage PyPI though, if it weren't such a great thing 
it wouldn't be important to us.
-- 
Benji York
http://benjiyork.com

From renesd at gmail.com  Wed Jul 11 14:20:22 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Wed, 11 Jul 2007 22:20:22 +1000
Subject: [Catalog-sig] Why so many zc.buildout versions?
In-Reply-To: <46946573.2070400@v.loewis.de>
References: <46937F10.3070201@weitershausen.de>
	<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
	<4693FBDC.2060201@v.loewis.de>
	<4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com>
	<46946573.2070400@v.loewis.de>
Message-ID: <64ddb72c0707110520j42bb8f27nb676bcf4de39d14c@mail.gmail.com>

I have to say the cheeseshop code was pretty easy to get into.

I think I was able to make most of my changes within the first reading of it.

It quite clearly separates things like the templates, the database
functionality and the 'webui'.

There definitely are a huge amount of things that I would love to
change with it over time, and I hope other people begin to develop it
more - it can only help the python community as a whole.

The amount of people doing releases has increased quite a lot even in
the last two months, so I think the releases will get more frequent.
As it grows it will continue to need different changes - optimizations
to the database/webserver, and also optimizations to the user
interface.


On 7/11/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> >> I have been thinking about the same thing. I think it would be good
> >> to have, however, it will surely take some time until all setuptools
> >> implementations learn to use it.
> >
> > No, not at all.  You can tell setuptools to use a different index than
> > the current one.  For example, this is a command-line option for
> > easy_install and a configuration option for buildout.
>
> Yes. However, that will make the feature only available to those who
> know about it. I have very shallow knowledge of setuptools and
> easy_install only (I nearly never use them at all), and I surely would
> miss such an option, and miss why it's relevant.
>
> It's true that the Apache installation could also redirect existing
> installations to the new pages, but I doubt that they would be otherwise
> widely used until setuptools changes its hard-coded default.
>
> >> Hmm. How about those using them extensively start contributing to
> >> them also?
> >
> > I like to think that I am by participating in this discussion.  Actually
> > changing the cheeseshop software has a very high learning curve. I don't
> > think that I can make that kind of time any time soon.  I'm very
> > grateful that you and Ren? are doing what you're doing.  I also suspect
> > that, given your and Ren?'s activity, it would be counter productive for
> > someone else to get involved at that level, but maybe I'm wrong about that.
>
> I strongly think you are. There are many things that could be improved,
> and I would not mind leaving the cheeseshop alone if some other
> maintainer came along - I also have other things to do.
>
> Regards,
> Martin
>
>
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>

From benji at benjiyork.com  Wed Jul 11 14:42:16 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 08:42:16 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -   apache config.
In-Reply-To: <469468C5.8000906@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>	<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
	<469468C5.8000906@v.loewis.de>
Message-ID: <4694D028.6050203@benjiyork.com>

Martin v. L?wis wrote:
>>> Hmm. I'm somewhat skeptical about setuptools (or any other packaging
>>> infrastructure, say, Debian) establishing rules on what makes a
>>> difference in package names.
>> Why?  It certainly seems reasonable to me for a packaging system to  
>> define rules for package names.
> 
> Ah, sure. It's certainly fine and reasonable for a packaging system
> to do that for its own purposes. However, I'm skeptical about that
> packaging system then to enforce its rules on other systems (such
> as the cheeseshop, which is not packaging system).

Although it wasn't part of the cheeseshop's original mission, it has 
become an integral part of distributing Python packages.  If it doesn't 
want to participate in its new-found utility, other options need to be 
explored.
-- 
Benji York
http://benjiyork.com

From jim at zope.com  Wed Jul 11 14:52:20 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 08:52:20 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <469467AA.7070409@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
	<469467AA.7070409@v.loewis.de>
Message-ID: <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>


On Jul 11, 2007, at 1:16 AM, Martin v. L?wis wrote:
...
>> IOW, setuptools' focus is more on distribution filename safety,  
>> rather
>> than on sensible naming distinctions for end users.  The former is  
>> less
>> restrictive than the latter, I believe.
>
> Yes. However, it's not clear to me that the infrastructure needs to
> (or even is able to) enforce sensible naming. Instead, any policing
> that might be necessary should be done in the community. If two
> packages are named too similarly, users will get confused, and
> eventually one package may disappear, get renamed, get its naming
> challenged in court, and so on. It's not the job of the package
> *index* to do that sort of policing.

When Phillip designed setuptools, he tried to have a very low impact  
on lots of systems.  He did that very well and that has allowed  
setuptools to be adopted gradually with very little up front buy in.

One of the decisions Phillip made was to not use an installed-package  
database other than sys.path.  When a distribution is installed, the  
installed file name reflects the package name.  If you want to know  
whether a package is installed, you can scan sys.path looking for  
files or directories that contain/reflect the package name.  IMO,  
this was a very good decision, however, it does have the disadvantage  
that it may run afoul of system file-naming limitations.  Again, I  
think this was a fair trade off.

The questions for us is, how much effort we are willing to make to  
prevent people from shooting themselves in the foot.  I can  
understand why Phillip would like the package index to prevent people  
from choosing problematic package names.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Wed Jul 11 14:56:22 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 08:56:22 -0400
Subject: [Catalog-sig] Why so many zc.buildout versions?
In-Reply-To: <46946573.2070400@v.loewis.de>
References: <46937F10.3070201@weitershausen.de>
	<73FE055E-D4F4-44E7-9DEE-353601795EC2@zope.com>
	<4693FBDC.2060201@v.loewis.de>
	<4D7FD5E2-7460-4A48-A1B0-C1247B0A3FB8@zope.com>
	<46946573.2070400@v.loewis.de>
Message-ID: <7E6E8D05-9669-4765-B61D-254835DDA553@zope.com>


On Jul 11, 2007, at 1:06 AM, Martin v. L?wis wrote:

>>> I have been thinking about the same thing. I think it would be good
>>> to have, however, it will surely take some time until all setuptools
>>> implementations learn to use it.
>>
>> No, not at all.  You can tell setuptools to use a different index  
>> than
>> the current one.  For example, this is a command-line option for
>> easy_install and a configuration option for buildout.
>
> Yes. However, that will make the feature only available to those who
> know about it. I have very shallow knowledge of setuptools and
> easy_install only (I nearly never use them at all), and I surely would
> miss such an option, and miss why it's relevant.

That's fine.  I don't care if most people can find it.  While it is  
an *experimental* index, it is fine if only a few people play with  
it.  If it is proven to work properly, then we could arrange that  
other people get it by default.

> It's true that the Apache installation could also redirect existing
> installations to the new pages, but I doubt that they would be  
> otherwise
> widely used until setuptools changes its hard-coded default.

Right, that's why, if the experiment works, we should then change the  
Apache config to rediect setuptools to it.

Changing the apache config is much easier than updating the  
setuptools installed base.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Wed Jul 11 15:32:31 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 09:32:31 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <46946871.3060100@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de>
	<CFA0A89E-0885-4084-8BD9-92868EF21B63@zope.com>
	<46946871.3060100@v.loewis.de>
Message-ID: <A5F9BD9D-DEC5-45B3-B026-569ED42FAC4A@zope.com>


On Jul 11, 2007, at 1:19 AM, Martin v. L?wis wrote:
...
>>> That's also my concern. Nobody else is complaining; AFAICT, there
>>> is just one unhappy user of PyPI.
>>
>> Oh come on, I'm not the only one who has posted messages on this  
>> mailing
>> list over the last few weeks reporting problems.
>
> Can you kindly refer to four or five such messages in the archives?
> I must have missed them.


http://mail.python.org/pipermail/catalog-sig/2007-June/001099.html
http://mail.python.org/pipermail/catalog-sig/2007-June/001101.html
http://mail.python.org/pipermail/catalog-sig/2007-April/001049.html
http://mail.python.org/pipermail/catalog-sig/2006-November/000997.html

There haven't been a large number of messages.

There must not be a problem.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


http://mail.python.org/pipermail/catalog-sig/2007-June/001099.html

From jim at zope.com  Wed Jul 11 15:34:41 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 09:34:41 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -   apache config.
In-Reply-To: <469468C5.8000906@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>
	<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
	<469468C5.8000906@v.loewis.de>
Message-ID: <F4062140-DA6C-4F8C-A782-0027128BFED2@zope.com>


On Jul 11, 2007, at 1:21 AM, Martin v. L?wis wrote:

>>> Hmm. I'm somewhat skeptical about setuptools (or any other packaging
>>> infrastructure, say, Debian) establishing rules on what makes a
>>> difference in package names.
>>
>> Why?  It certainly seems reasonable to me for a packaging system to
>> define rules for package names.
>
> Ah, sure. It's certainly fine and reasonable for a packaging system
> to do that for its own purposes. However, I'm skeptical about that
> packaging system then to enforce its rules on other systems (such
> as the cheeseshop, which is not packaging system).

OK, let's take a step back.  IMO, PyPI is a *part* is the packaging  
system.  If we can't agree that that is true, then we need to find a  
package index that *is* part of the package system.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Wed Jul 11 16:03:33 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 10:03:33 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
Message-ID: <721297D4-85EA-4397-84C9-D90E5598477A@zope.com>


On Jul 11, 2007, at 12:04 AM, richardjones at optusnet.com.au wrote:

> Stephen Waterbury <waterbug at pangalactic.us> wrote:
>> Martin v. L?wis wrote:
>>> [Jim Fulton wrote:]
>>>> Maybe others can chime in.
>>>
>>> That's also my concern. Nobody else is complaining; AFAICT, there
>>> is just one unhappy user of PyPI.
>>
>> I'm not happy with PyPI's performance either.
>> Probably many users are like me:  I thought it was
>> common knowledge that the performance of PyPI was bad, but
>> I didn't want to complain when it appeared that people were
>> working on improvements.
>
> It has been slow in the past, but Martin has done some great work  
> speeding it up in the last few days.

Yup. Much thanks Martin!

> If it's still slow, please report when you noticed and what you  
> were trying to do.

Let's look at the new-improved times.  Right now ~14:00UTC July 11:

   http://www.python.org/ZODB3 takes about .3 seconds (median)(means  
is higher)
   http://www.python.org/ZODB3/3.8.0b2 also takes about .3 seconds
   http://www.python.org/pypi/ takes aabout 6 seconds (median)

For the sake of argument, let's ignore http://www.python.org/pypi/.

The .3-second times per request is *much* better than we had before  
(I assume), but it's *not fast enough*.  The demand on the package  
index used by setuptools is going to increase substantially.  Even if  
setuptools only made a single request per package, .3 seconds per  
request is too slow.  Given the current structure of the index,  
setuptools has to make a request for the package and a request per  
release.  For ZODB, this means about 12 requests, or more than 3  
seconds.  Of course, this will increase over time, as more releases  
are made.

The progress Martin has made has (I assume and hope) greatly  
increased the reliability and performance of PYPI.  This is very  
important and much appreciated.  It is not enough in the long (or, I  
suspect medium) term.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From nathan at creativecommons.org  Wed Jul 11 16:28:33 2007
From: nathan at creativecommons.org (Nathan R. Yergler)
Date: Wed, 11 Jul 2007 07:28:33 -0700
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <46946A69.4000702@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
	<46946A69.4000702@v.loewis.de>
Message-ID: <bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>

On 7/10/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > I'm not happy with PyPI's performance either.
> > Probably many users are like me:  I thought it was
> > common knowledge that the performance of PyPI was bad
>
> Please trust me that it isn't. I know that PyPI could
> become unresponsive, and I FIXED that. AFAICT, it's
> solved, done, can't happen again. I do not know that
> performance IS bad; I know that it WAS bad (primarily
> not due to the way the software was written, but
> due to the way it was run).

The speed has noticeably improved (thanks!) but as recently as Monday
PyPI was unresponsive and then returning proxy errors.  It definitely
caused us (Creative Commons) to lose productivity Monday afternoon
(PDT).

Nathan


>
> > but
> > I didn't want to complain when it appeared that people were
> > working on improvements.
>
> Sure: mere complaints would not be constructive. However,
> specific *reports* of problems are absolutely necessary.
> If you experience problems today, tomorrow, next week,
> by all means, report them. Different people apparently
> also have different perception what good performance is,
> so please always make a full bug report:
>
> - what precisely did you do (including "when" also
>   in this case),
> - what happened,
> - what did you expect to happen instead
>
> Regards,
> Martin
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>

From jodok at lovelysystems.com  Wed Jul 11 17:57:14 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Wed, 11 Jul 2007 17:57:14 +0200 (CEST)
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <A5F9BD9D-DEC5-45B3-B026-569ED42FAC4A@zope.com>
Message-ID: <21138246.8381184169434589.JavaMail.root@post.webmeisterei.com>

sorry for incorrect quoting - i'm at europython and the webmailer behaves badly...

i've been complaining loudly! :)
in fact cheeseshop was unusably slow. in meanwhile we built our own index and are not depending on cheeseshop anymore. i think at least me (lovely systems) and jim (zope corporation) offered help and volunteered to pay someone to fix it.

for me, the current solution is just "tuning", but not addressing the general problem behind the current software design (that is the pypi software and parts of setuptools in general). i've been following the thread actively and like to thank especially martin for his work to get a short-term solution. nevertheless we need to solve these issues. as a lot of other projects are moving to egg-based distributions pypi is a integral part. 
baking static pages would be my first choice. 

jodok

----- Original Message -----
From: "Jim Fulton" <jim at zope.com>
To: "=?ISO-8859-1?Q? \"Martin_v._L=F6wis\" ?=" <martin at v.loewis.de>
Cc: catalog-sig at python.org
Sent: Wednesday, July 11, 2007 4:32:31 PM (GMT+0200) Europe/Athens
Subject: Re: [Catalog-sig] start on static generation, and caching - apache config.


On Jul 11, 2007, at 1:19 AM, Martin v. L?wis wrote:
...
>>> That's also my concern. Nobody else is complaining; AFAICT, there
>>> is just one unhappy user of PyPI.
>>
>> Oh come on, I'm not the only one who has posted messages on this  
>> mailing
>> list over the last few weeks reporting problems.
>
> Can you kindly refer to four or five such messages in the archives?
> I must have missed them.


http://mail.python.org/pipermail/catalog-sig/2007-June/001099.html
http://mail.python.org/pipermail/catalog-sig/2007-June/001101.html
http://mail.python.org/pipermail/catalog-sig/2007-April/001049.html
http://mail.python.org/pipermail/catalog-sig/2006-November/000997.html

There haven't been a large number of messages.

There must not be a problem.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org


http://mail.python.org/pipermail/catalog-sig/2007-June/001099.html
_______________________________________________
Catalog-SIG mailing list
Catalog-SIG at python.org
http://mail.python.org/mailman/listinfo/catalog-sig


-- 
Lovely Systems, Partner

phone: +43 5572 908060, fax: +43 5572 908060-77
Schmelzh?tterstra?e 26a, 6850 Dornbirn, Austria


From martin at v.loewis.de  Wed Jul 11 19:40:34 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 19:40:34 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	
	<468FF69B.2090503@v.loewis.de>	
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	
	<46910BBF.3010308@v.loewis.de>	
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	
	<4692B3A3.5030209@v.loewis.de>	
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>	
	<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>	
	<46946A69.4000702@v.loewis.de>
	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>
Message-ID: <46951612.9010009@v.loewis.de>

> The speed has noticeably improved (thanks!) but as recently as Monday
> PyPI was unresponsive and then returning proxy errors.  It definitely
> caused us (Creative Commons) to lose productivity Monday afternoon
> (PDT).

Ok. What precisely was that proxy error? (I'm puzzled, because I'm
not aware of a proxy somewhere)

Regards,
Martin

From fdrake at gmail.com  Wed Jul 11 19:42:04 2007
From: fdrake at gmail.com (Fred Drake)
Date: Wed, 11 Jul 2007 13:42:04 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
	<46946A69.4000702@v.loewis.de>
	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>
Message-ID: <9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com>

On 7/11/07, Nathan R. Yergler <nathan at creativecommons.org> wrote:
> The speed has noticeably improved (thanks!) but as recently as Monday
> PyPI was unresponsive and then returning proxy errors.  It definitely
> caused us (Creative Commons) to lose productivity Monday afternoon
> (PDT).

We're seeing this right now, too.  I'm checking both www.python.org
and cheeseshop.python.org.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"Chaos is the score upon which reality is written." --Henry Miller

From nathan at creativecommons.org  Wed Jul 11 19:47:33 2007
From: nathan at creativecommons.org (Nathan R. Yergler)
Date: Wed, 11 Jul 2007 10:47:33 -0700
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <46951612.9010009@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
	<46946A69.4000702@v.loewis.de>
	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>
	<46951612.9010009@v.loewis.de>
Message-ID: <bf7b44d50707111047u7e2ae963t9395e769ba827d@mail.gmail.com>

On 7/11/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > The speed has noticeably improved (thanks!) but as recently as Monday
> > PyPI was unresponsive and then returning proxy errors.  It definitely
> > caused us (Creative Commons) to lose productivity Monday afternoon
> > (PDT).
>
> Ok. What precisely was that proxy error? (I'm puzzled, because I'm
> not aware of a proxy somewhere)

IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache
is passing requests through to a local process (mod_rewrite or
mod_proxy?), and that process wasn't responding.

>
> Regards,
> Martin
>

From jim at zope.com  Wed Jul 11 19:50:01 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 13:50:01 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <46951612.9010009@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	
	<468FF69B.2090503@v.loewis.de>	
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	
	<46910BBF.3010308@v.loewis.de>	
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	
	<4692B3A3.5030209@v.loewis.de>	
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>	
	<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>	
	<46946A69.4000702@v.loewis.de>
	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>
	<46951612.9010009@v.loewis.de>
Message-ID: <F64D623E-D53A-4A5C-A96C-80B2225C6BE3@zope.com>


On Jul 11, 2007, at 1:40 PM, Martin v. L?wis wrote:

>> The speed has noticeably improved (thanks!) but as recently as Monday
>> PyPI was unresponsive and then returning proxy errors.  It definitely
>> caused us (Creative Commons) to lose productivity Monday afternoon
>> (PDT).
>
> Ok. What precisely was that proxy error? (I'm puzzled, because I'm
> not aware of a proxy somewhere)

Here's the error I just got after several minutes of spinning trying  
to get: http://www.python.org/pypi/ZODB3

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Temporarily Unavailable</title>
</head><body>
<h1>Service Temporarily Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
</body></html>

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From benji at benjiyork.com  Wed Jul 11 19:50:39 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 13:50:39 -0400
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <46946E55.30308@v.loewis.de>
References: <200707102016.40669.srichter@cosmos.phy.tufts.edu>
	<46946E55.30308@v.loewis.de>
Message-ID: <4695186F.3030207@benjiyork.com>

Martin v. L?wis wrote:
> So yes, I trust that there have been complaints in the past - I
> wonder whether there are *still* complaints (beyond the ones
> of Jim Fulton).

Here's a complaint: the cheeseshop is down.
-- 
Benji York
http://benjiyork.com

From fdrake at gmail.com  Wed Jul 11 19:50:56 2007
From: fdrake at gmail.com (Fred Drake)
Date: Wed, 11 Jul 2007 13:50:56 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <bf7b44d50707111047u7e2ae963t9395e769ba827d@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
	<46946A69.4000702@v.loewis.de>
	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>
	<46951612.9010009@v.loewis.de>
	<bf7b44d50707111047u7e2ae963t9395e769ba827d@mail.gmail.com>
Message-ID: <9cee7ab80707111050v1573ec23s7e48e8a09bec1d1c@mail.gmail.com>

On 7/11/07, Nathan R. Yergler <nathan at creativecommons.org> wrote:
> IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache
> is passing requests through to a local process (mod_rewrite or
> mod_proxy?), and that process wasn't responding.

Firefox's "Page Info" says 503.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"Chaos is the score upon which reality is written." --Henry Miller

From nathan at creativecommons.org  Wed Jul 11 19:55:46 2007
From: nathan at creativecommons.org (Nathan R. Yergler)
Date: Wed, 11 Jul 2007 10:55:46 -0700
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <9cee7ab80707111050v1573ec23s7e48e8a09bec1d1c@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
	<46946A69.4000702@v.loewis.de>
	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>
	<46951612.9010009@v.loewis.de>
	<bf7b44d50707111047u7e2ae963t9395e769ba827d@mail.gmail.com>
	<9cee7ab80707111050v1573ec23s7e48e8a09bec1d1c@mail.gmail.com>
Message-ID: <bf7b44d50707111055l60471c19id029a42c9774c27a@mail.gmail.com>

I'm getting the following right now:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
The proxy server could not handle the request <em><a
href="/pypi">GET&nbsp;/pypi</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>

</body></html>


On 7/11/07, Fred Drake <fdrake at gmail.com> wrote:
> On 7/11/07, Nathan R. Yergler <nathan at creativecommons.org> wrote:
> > IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache
> > is passing requests through to a local process (mod_rewrite or
> > mod_proxy?), and that process wasn't responding.
>
> Firefox's "Page Info" says 503.
>
>
>   -Fred
>
> --
> Fred L. Drake, Jr.    <fdrake at gmail.com>
> "Chaos is the score upon which reality is written." --Henry Miller
>

From martin at v.loewis.de  Wed Jul 11 20:01:59 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 20:01:59 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <4694C8B6.1030804@benjiyork.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>	<4693FE94.6090107@v.loewis.de>	<CFA0A89E-0885-4084-8BD9-92868EF21B63@zope.com>	<46946871.3060100@v.loewis.de>
	<4694C8B6.1030804@benjiyork.com>
Message-ID: <46951B17.4000104@v.loewis.de>

Benji York schrieb:
> Martin v. L?wis wrote:
>>> People are doing it, usually in limited ways, out of desperation.
>> Same question to these people, then (whoever they are): why
>> do you think it's easier to build your own index in desperation,
>> rather than contributing to PyPI?
> 
> Because they aren't aware of the progress being made or the intent to 
> make more?

And then, why didn't they ask how they could help?

People can start all the projects they want, of course. It just seems
like a waste of volunteer time to work on competing projects.

> Here's one (you didn't say they had to be past messages <wink>).

And indeed, I'm more interested in new reports than in old ones
(since the system changed since the old ones).

> Is your position that PyPI isn't down/very slow on occasion or that when 
> it is no one complains?

Both. I believe it shouldn't be down, and I have no precise reports of
it being "very slow". Jim Fulton complained that it took 0.3s to
get a single package's page, which I cannot classify as "very slow".

> My team has lost many man hours to PyPI begin down/glacially slow.  This 
> isn't meant to disparage PyPI though, if it weren't such a great thing 
> it wouldn't be important to us.

But when did that happen precisely?

Regards,
Martin

From martin at v.loewis.de  Wed Jul 11 20:03:01 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 20:03:01 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <bf7b44d50707111047u7e2ae963t9395e769ba827d@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	
	<46910BBF.3010308@v.loewis.de>	
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	
	<4692B3A3.5030209@v.loewis.de>	
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>	
	<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>	
	<46946A69.4000702@v.loewis.de>	
	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>	
	<46951612.9010009@v.loewis.de>
	<bf7b44d50707111047u7e2ae963t9395e769ba827d@mail.gmail.com>
Message-ID: <46951B55.9050009@v.loewis.de>

>> Ok. What precisely was that proxy error? (I'm puzzled, because I'm
>> not aware of a proxy somewhere)
> 
> IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache
> is passing requests through to a local process (mod_rewrite or
> mod_proxy?), and that process wasn't responding.

Neither is going on for PyPI, AFAIK - it's mod_fastcgi.

Regards,
Martin

From nathan at creativecommons.org  Wed Jul 11 20:06:20 2007
From: nathan at creativecommons.org (Nathan R. Yergler)
Date: Wed, 11 Jul 2007 11:06:20 -0700
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <46951B55.9050009@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
	<46946A69.4000702@v.loewis.de>
	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>
	<46951612.9010009@v.loewis.de>
	<bf7b44d50707111047u7e2ae963t9395e769ba827d@mail.gmail.com>
	<46951B55.9050009@v.loewis.de>
Message-ID: <bf7b44d50707111106o5e8278b6vc94a057202c2a026@mail.gmail.com>

On 7/11/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> >> Ok. What precisely was that proxy error? (I'm puzzled, because I'm
> >> not aware of a proxy somewhere)
> >
> > IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache
> > is passing requests through to a local process (mod_rewrite or
> > mod_proxy?), and that process wasn't responding.
>
> Neither is going on for PyPI, AFAIK - it's mod_fastcgi.
>

So perhaps the external fastcgi server has barfed?  Like I said, I was
just guessing based on past experience.  I don't know enough about the
internals of PyPI to actually comment on how applicable that
experience is.

NRY

From benji at benjiyork.com  Wed Jul 11 20:25:58 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 14:25:58 -0400
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <46951B17.4000104@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>	<4693FE94.6090107@v.loewis.de>	<CFA0A89E-0885-4084-8BD9-92868EF21B63@zope.com>	<46946871.3060100@v.loewis.de>
	<4694C8B6.1030804@benjiyork.com> <46951B17.4000104@v.loewis.de>
Message-ID: <469520B6.2030002@benjiyork.com>

Martin v. L?wis wrote:
> Benji York schrieb:

>> Is your position that PyPI isn't down/very slow on occasion or that when 
>> it is no one complains?
> 
> Both. I believe it shouldn't be down

The cheeseshop has provided its own proof that that believe is mistaken 
by being down as I began composing this message. <wink>

> Jim Fulton complained that it took 0.3s to
> get a single package's page, which I cannot classify as "very slow".

During a single run setuptools or zc.buildout may make hundreds of 
requests to the cheeseshop taking a total time in the minutes.  That's 
not fast enough.  I can't see a technical reason why these requests 
couldn't be handled much faster than 3 a second.

>> My team has lost many man hours to PyPI begin down/glacially slow.  This 
>> isn't meant to disparage PyPI though, if it weren't such a great thing 
>> it wouldn't be important to us.
> 
> But when did that happen precisely?

I don't recall precisely.  I'll be sure to report outages religiously 
from now on.
-- 
Benji York
http://benjiyork.com

From martin at v.loewis.de  Wed Jul 11 20:27:00 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 20:27:00 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <bf7b44d50707111055l60471c19id029a42c9774c27a@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<4692B3A3.5030209@v.loewis.de>	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>	<4693FE94.6090107@v.loewis.de>
	<469446A2.9070500@pangalactic.us>	<46946A69.4000702@v.loewis.de>	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>	<46951612.9010009@v.loewis.de>	<bf7b44d50707111047u7e2ae963t9395e769ba827d@mail.gmail.com>	<9cee7ab80707111050v1573ec23s7e48e8a09bec1d1c@mail.gmail.com>
	<bf7b44d50707111055l60471c19id029a42c9774c27a@mail.gmail.com>
Message-ID: <469520F4.2050708@v.loewis.de>

Nathan R. Yergler schrieb:
> I'm getting the following right now:
> 
> <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
> <html><head>
> <title>502 Proxy Error</title>
> </head><body>
> <h1>Proxy Error</h1>
> <p>The proxy server received an invalid
> response from an upstream server.<br />
> The proxy server could not handle the request <em><a
> href="/pypi">GET&nbsp;/pypi</a></em>.<p>
> Reason: <strong>Error reading from remote server</strong></p></p>
> 
> </body></html>

Thanks for all the reports. I'm really puzzled what precisely
happened. Apache has logged tons of the error messages

[Wed Jul 11 20:11:01 2007] [warn] FastCGI: server
"/data/pypi/src/pypi/pypi.fcgi" has failed to remain running for 30
seconds given 3 attempts, its restart interval has been backed off to
600 seconds

That caused the outage: the PyPI FCGI servers stopped, and failed
to restart, so FCGI backed off starting new ones.

However, I don't understand why PyPI crashed - it did not leave
a log message, and did not send an error email. After restarting
it, it seems to run just fine. The first crashed server was started
7:56 (UTC+2), and, at 11:04, the line

[warn] FastCGI: server "/data/pypi/src/pypi/pypi.fcgi" (pid 3770)
terminated by calling exit with status '0'

was logged, i.e. PyPI voluntarily decided to exit. The same happened
later again and again, but I can't figure out why it would do such
a thing.

Regards,
Martin

From martin at v.loewis.de  Wed Jul 11 20:27:57 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 20:27:57 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <bf7b44d50707111106o5e8278b6vc94a057202c2a026@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	
	<4692B3A3.5030209@v.loewis.de>	
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>	
	<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>	
	<46946A69.4000702@v.loewis.de>	
	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>	
	<46951612.9010009@v.loewis.de>	
	<bf7b44d50707111047u7e2ae963t9395e769ba827d@mail.gmail.com>	
	<46951B55.9050009@v.loewis.de>
	<bf7b44d50707111106o5e8278b6vc94a057202c2a026@mail.gmail.com>
Message-ID: <4695212D.6010406@v.loewis.de>

> So perhaps the external fastcgi server has barfed?  Like I said, I was
> just guessing based on past experience.  I don't know enough about the
> internals of PyPI to actually comment on how applicable that
> experience is.

I just looked into it a little - that happened, but I don't know why.

Regards,
Martin

From martin at v.loewis.de  Wed Jul 11 20:32:22 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 20:32:22 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -   apache config.
In-Reply-To: <4694D028.6050203@benjiyork.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>	<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
	<469468C5.8000906@v.loewis.de> <4694D028.6050203@benjiyork.com>
Message-ID: <46952236.30704@v.loewis.de>

> Although it wasn't part of the cheeseshop's original mission, it has
> become an integral part of distributing Python packages.  If it doesn't
> want to participate in its new-found utility, other options need to be
> explored.

It's a software system; it doesn't have a mission.

I just dislike making unilateral decisions.

Regards,
Martin

From martin at v.loewis.de  Wed Jul 11 20:35:02 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 20:35:02 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
	<469467AA.7070409@v.loewis.de>
	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
Message-ID: <469522D6.1070706@v.loewis.de>

> The questions for us is, how much effort we are willing to make to
> prevent people from shooting themselves in the foot.  I can understand
> why Phillip would like the package index to prevent people from choosing
> problematic package names.

That's not my understanding - the issue isn't with "problematic package
names", but with conflicting package names. IOW, any single name is
fine - it's a pair of names that would cause a problem (and only if
you wanted to install both packages on the same system).

Regards,
Martin

From martin at v.loewis.de  Wed Jul 11 20:36:57 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 20:36:57 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -   apache config.
In-Reply-To: <F4062140-DA6C-4F8C-A782-0027128BFED2@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>
	<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
	<469468C5.8000906@v.loewis.de>
	<F4062140-DA6C-4F8C-A782-0027128BFED2@zope.com>
Message-ID: <46952349.5050606@v.loewis.de>

> OK, let's take a step back.  IMO, PyPI is a *part* is the packaging
> system.  If we can't agree that that is true, then we need to find a
> package index that *is* part of the package system.

It might be hairsplitting to discuss this specific question, but
I think the purpose of PyPI is to allow people to find Python
packages, i.e. it is a package index.

Regards,
Martin

From martin at v.loewis.de  Wed Jul 11 20:40:29 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 20:40:29 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
Message-ID: <4695241D.3090203@v.loewis.de>

> The .3-second times per request is *much* better than we had before  
> (I assume), but it's *not fast enough*.  The demand on the package  
> index used by setuptools is going to increase substantially.  Even if  
> setuptools only made a single request per package, .3 seconds per  
> request is too slow.  Given the current structure of the index,  
> setuptools has to make a request for the package and a request per  
> release.  For ZODB, this means about 12 requests, or more than 3  
> seconds.  Of course, this will increase over time, as more releases  
> are made.

This I still don't understand. Why does it need to query all available
releases?

Regards,
Martin


From benji at benjiyork.com  Wed Jul 11 20:41:32 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 14:41:32 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -   apache config.
In-Reply-To: <46952236.30704@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>	<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
	<469468C5.8000906@v.loewis.de> <4694D028.6050203@benjiyork.com>
	<46952236.30704@v.loewis.de>
Message-ID: <4695245C.3020703@benjiyork.com>

Martin v. L?wis wrote:
>> Although it wasn't part of the cheeseshop's original mission, it has
>> become an integral part of distributing Python packages.  If it doesn't
>> want to participate in its new-found utility, other options need to be
>> explored.
> 
> It's a software system; it doesn't have a mission.

This SIG has a mission, I was under the impression that the cheeseshop 
was developed to forward that mission.  If not, we need to start work on 
something that will provide a usable server-side compliment to setuptools.

> I just dislike making unilateral decisions.

Fortunately you don't have to.  We have several people here with varied 
experience that have the facilities to communicate their desires and 
expertise.
-- 
Benji York
http://benjiyork.com

From jim at zope.com  Wed Jul 11 20:45:09 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 14:45:09 -0400
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To: <46952349.5050606@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>
	<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
	<469468C5.8000906@v.loewis.de>
	<F4062140-DA6C-4F8C-A782-0027128BFED2@zope.com>
	<46952349.5050606@v.loewis.de>
Message-ID: <C91F9213-4ED8-4BF4-BBA0-4C06245BEC90@zope.com>


On Jul 11, 2007, at 2:36 PM, Martin v. L?wis wrote:

>> OK, let's take a step back.  IMO, PyPI is a *part* is the packaging
>> system.  If we can't agree that that is true, then we need to find a
>> package index that *is* part of the package system.
>
> It might be hairsplitting to discuss this specific question, but
> I think the purpose of PyPI is to allow people to find Python
> packages, i.e. it is a package index.

Let me try to put this another way.

Can we agree that it is part of the purpose of PyPI to serve as a  
repository for setuptools?  I'd like to resolve this issue.  If it  
isn't part of PyPI's purpose to serve as a repository for setuptools,  
then we'll build another system that *does* have that purpose.  If it  
is part of the purpose to serve as a repository for setuptools, then  
we'll need to take various needs of setuptools into account.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Wed Jul 11 20:37:21 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 11 Jul 2007 14:37:21 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -     apache config.
In-Reply-To: <469467AA.7070409@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
	<469467AA.7070409@v.loewis.de>
Message-ID: <20070711184549.733CE3A404D@sparrow.telecommunity.com>

At 07:16 AM 7/11/2007 +0200, Martin v. L?wis wrote:
> > Note that Windows (and Mac OS under certain circumstances) have filename
> > case insensitivity, and have different restrictions about what can or
> > can't be in a filename than Unix.  Spaces and other punctuation
> > characters can cause problems for shells, even if they're theoretically
> > acceptable as filenames.
>
>I can see that collisions should be avoided in advance when it comes to
>file names. However, the name of a software package is not necessarily a
>file name,

Actually, it is.  The distutils generate distribution filenames based on this.


> > IOW, setuptools' focus is more on distribution filename safety, rather
> > than on sensible naming distinctions for end users.  The former is less
> > restrictive than the latter, I believe.
>
>Yes. However, it's not clear to me that the infrastructure needs to
>(or even is able to) enforce sensible naming.

I said sensible *distinctions* - not sensible naming.  Clearly, we 
can't advise people not to publish packages named "Joe's 
Miscellaneous Functions", at least not in an automated way.  :)


>  Instead, any policing
>that might be necessary should be done in the community. If two
>packages are named too similarly, users will get confused, and
>eventually one package may disappear, get renamed, get its naming
>challenged in court, and so on. It's not the job of the package
>*index* to do that sort of policing.

Within its own scope, that's a valid and sensible argument.  Within 
the larger scope of "what is good for users", I would say it does no 
*good* to allow people to register such similar package names, and in 
many cases will do *harm* to do so.

Contrariwise, it will not do *harm* to anyone to reject their 
too-similar name, and will in fact do them good.  Today, I almost 
created a package called "Aspects".  Had I done so, and uploaded it 
to the Cheeseshop, I wouldn't have been warned that there is already 
a package named "aspects".  I would have been well on my way to 
creating confusion that would be entirely avoidable, were the 
Cheeseshop to stop me at the point of registration or uploading.

Since the restriction can cause no real harm, and produces a net 
good, but the lack of restriction can cause real harm (e.g., I had to 
later change a package name, thereby breaking dependencies in other 
packages), there is no reason *not* to provide that benefit to the 
users, and protect them from that harm.

Perhaps, as Jim says, it is time to start treating PyPI as part of 
the packaging system.  It is so in fact, anyway.  Meanwhile, the 
separation between cataloging and packaging means other issues, such 
as the complete disconnect between the cataloging of metadata and the 
automated production and use of such metadata.  The PKG-INFO format 
has been degrading with each new version, in terms of defining more 
metadata for which over-restrictive *syntax* is defined, while being 
almost completely lacking in any *semantics*.

This schism between the idea of neatly cataloging things, versus 
being able to actually *use* that cataloging for practical purposes 
by automated tools (as opposed to being usable only to human 
readers), seems to be at the heart of some of the current discussion.


From martin at v.loewis.de  Wed Jul 11 20:46:02 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 20:46:02 +0200
Subject: [Catalog-sig] cheeseshop outage
In-Reply-To: <9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>	<4693FE94.6090107@v.loewis.de>
	<469446A2.9070500@pangalactic.us>	<46946A69.4000702@v.loewis.de>	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>
	<9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com>
Message-ID: <4695256A.5020208@v.loewis.de>

Fred Drake schrieb:
> On 7/11/07, Nathan R. Yergler <nathan at creativecommons.org> wrote:
>> The speed has noticeably improved (thanks!) but as recently as Monday
>> PyPI was unresponsive and then returning proxy errors.  It definitely
>> caused us (Creative Commons) to lose productivity Monday afternoon
>> (PDT).
> 
> We're seeing this right now, too.  I'm checking both www.python.org
> and cheeseshop.python.org.

If www.python.org is up, should be safe to ignore. If you can find any
post-mortem evidence on ximinez, that would be much appreciated.

Regards,
Martin

P.S. Why is www.python.org proxying for ximinez? Shouldn't it perform
redirects instead?

From benji at benjiyork.com  Wed Jul 11 20:52:50 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 14:52:50 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -     apache config.
In-Reply-To: <20070711184549.733CE3A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>	<469467AA.7070409@v.loewis.de>
	<20070711184549.733CE3A404D@sparrow.telecommunity.com>
Message-ID: <46952702.8060606@benjiyork.com>

Phillip J. Eby wrote:
> This schism between the idea of neatly cataloging things, versus 
> being able to actually *use* that cataloging for practical purposes 
> by automated tools (as opposed to being usable only to human 
> readers), seems to be at the heart of some of the current discussion.

Wasn't there a proposal to merge the catalog-sig and distutils-sig?
-- 
Benji York
http://benjiyork.com

From jim at zope.com  Wed Jul 11 20:57:43 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 14:57:43 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <4695241D.3090203@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
	<4695241D.3090203@v.loewis.de>
Message-ID: <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>


On Jul 11, 2007, at 2:40 PM, Martin v. L?wis wrote:

>> The .3-second times per request is *much* better than we had before
>> (I assume), but it's *not fast enough*.  The demand on the package
>> index used by setuptools is going to increase substantially.  Even if
>> setuptools only made a single request per package, .3 seconds per
>> request is too slow.  Given the current structure of the index,
>> setuptools has to make a request for the package and a request per
>> release.  For ZODB, this means about 12 requests, or more than 3
>> seconds.  Of course, this will increase over time, as more releases
>> are made.
>
> This I still don't understand. Why does it need to query all available
> releases?

The way that setuptools currently works, it scans each of the release  
pages looking for distributions.  In theory, it could take the names  
of these pages into account and scan fewer.  It will still have to  
scan at least 2.

I have a feeling that I'll never convince you that a third of a  
second is too slow.  I think I'll stop trying.  Hopefully, Ren?, will  
be able to get baking working, at which point the pages will be a lot  
faster.  At that point, I think it would be good to pursue alternate  
pages more optimized for setuptools to reduce the number and size of  
setuptools requests.   I'll help any way I can with that.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Wed Jul 11 21:03:12 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 11 Jul 2007 15:03:12 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -  apache config.
In-Reply-To: <469520B6.2030002@benjiyork.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de>
	<CFA0A89E-0885-4084-8BD9-92868EF21B63@zope.com>
	<46946871.3060100@v.loewis.de> <4694C8B6.1030804@benjiyork.com>
	<46951B17.4000104@v.loewis.de> <469520B6.2030002@benjiyork.com>
Message-ID: <20070711190058.2322F3A404D@sparrow.telecommunity.com>

At 02:25 PM 7/11/2007 -0400, Benji York wrote:
>Martin v. L?wis wrote:
> > Benji York schrieb:
>
> >> Is your position that PyPI isn't down/very slow on occasion or that when
> >> it is no one complains?
> >
> > Both. I believe it shouldn't be down
>
>The cheeseshop has provided its own proof that that believe is mistaken
>by being down as I began composing this message. <wink>
>
> > Jim Fulton complained that it took 0.3s to
> > get a single package's page, which I cannot classify as "very slow".
>
>During a single run setuptools or zc.buildout may make hundreds of
>requests to the cheeseshop taking a total time in the minutes.  That's
>not fast enough.  I can't see a technical reason why these requests
>couldn't be handled much faster than 3 a second.

An interesting thought for future optimization...  an XML-RPC catalog 
server designed for this use case could in fact do all the 
computation server-side, resolving dependencies and evaluating 
version constraints.  Heck, in theory, it could cache packages' 
external links, and simply hand back to the caller a complete list of 
candidate URLs to choose for downloading.  That way, most activities 
would take only one server round-trip to complete, if the client sent 
a list of everything it expects to need, and the server includes 
everything that the server expects the client to want due to those 
things' dependencies.

The main obstacle to implementing such a service today, is that it 
would have no way of knowing what dependencies to look for, without 
sniffing the contents of .egg files.  But, as long as a superset of 
possible dependencies was listed in PKG-INFO, the server could make 
intelligent guesses about what other packages are likely to be 
needed, and return their version/download info as well.  Returning 
information for packages that turn out not to be needed is likely to 
be far less expensive than having to make round-trip requests.

An alternative to providing that information from metadata, of 
course, would be for the client to include a "referrer" header of 
sorts, saying why it is asking for a package.  The server could then 
simply "learn" the relevant associations.



From jim at zope.com  Wed Jul 11 21:08:03 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 15:08:03 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -  apache config.
In-Reply-To: <20070711190058.2322F3A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de>
	<CFA0A89E-0885-4084-8BD9-92868EF21B63@zope.com>
	<46946871.3060100@v.loewis.de> <4694C8B6.1030804@benjiyork.com>
	<46951B17.4000104@v.loewis.de> <469520B6.2030002@benjiyork.com>
	<20070711190058.2322F3A404D@sparrow.telecommunity.com>
Message-ID: <9EE8B28D-5B16-4AE8-8001-E3ECCC34A199@zope.com>


On Jul 11, 2007, at 3:03 PM, Phillip J. Eby wrote:

> At 02:25 PM 7/11/2007 -0400, Benji York wrote:
>> Martin v. L?wis wrote:
>>> Benji York schrieb:
>>
>>>> Is your position that PyPI isn't down/very slow on occasion or  
>>>> that when
>>>> it is no one complains?
>>>
>>> Both. I believe it shouldn't be down
>>
>> The cheeseshop has provided its own proof that that believe is  
>> mistaken
>> by being down as I began composing this message. <wink>
>>
>>> Jim Fulton complained that it took 0.3s to
>>> get a single package's page, which I cannot classify as "very slow".
>>
>> During a single run setuptools or zc.buildout may make hundreds of
>> requests to the cheeseshop taking a total time in the minutes.   
>> That's
>> not fast enough.  I can't see a technical reason why these requests
>> couldn't be handled much faster than 3 a second.
>
> An interesting thought for future optimization...  an XML-RPC catalog
> server designed for this use case could in fact do all the
> computation server-side, resolving dependencies and evaluating
> version constraints.  Heck, in theory, it could cache packages'
> external links, and simply hand back to the caller a complete list of
> candidate URLs to choose for downloading.  That way, most activities
> would take only one server round-trip to complete, if the client sent
> a list of everything it expects to need, and the server includes
> everything that the server expects the client to want due to those
> things' dependencies.

That wouldn't help when local (e.g. development) or private  
distributions need to be included in the mix.

I think collecting all of the links for a package that PYPI knows  
about on individual package pages would go a very long way to  
reducing the number of requests.  If these pages were served  
statically (or in similar times), then I think we'd be in very good  
shape.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Wed Jul 11 21:13:42 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 21:13:42 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -   apache config.
In-Reply-To: <4695245C.3020703@benjiyork.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>	<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
	<469468C5.8000906@v.loewis.de> <4694D028.6050203@benjiyork.com>
	<46952236.30704@v.loewis.de> <4695245C.3020703@benjiyork.com>
Message-ID: <46952BE6.1070604@v.loewis.de>

Benji York schrieb:
> Martin v. L?wis wrote:
>>> Although it wasn't part of the cheeseshop's original mission, it has
>>> become an integral part of distributing Python packages.  If it doesn't
>>> want to participate in its new-found utility, other options need to be
>>> explored.
>>
>> It's a software system; it doesn't have a mission.
> 
> This SIG has a mission, I was under the impression that the cheeseshop
> was developed to forward that mission.

That's true. That mission is "The Python Catalog SIG aims at producing a
master index of Python software and other resources."

I think this still is the mission - be *the* central site for indexing
Python software. The part "other resources" apparently never was
considered; it only indexes software now.

>> I just dislike making unilateral decisions.
> 
> Fortunately you don't have to.  We have several people here with varied
> experience that have the facilities to communicate their desires and
> expertise.

Ok. Of course, here the usual software engineer's reaction comes into
play: if you don't think something is that important, you try to come
up with reasons not doing it. I should have been more open: I don't
see that I have time to implement the clashing check that Phillip
proposed, although I'll see what I can do about the redirect
on lookup.

Regards,
Martin

From martin at v.loewis.de  Wed Jul 11 21:23:51 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 21:23:51 +0200
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To: <C91F9213-4ED8-4BF4-BBA0-4C06245BEC90@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>
	<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
	<469468C5.8000906@v.loewis.de>
	<F4062140-DA6C-4F8C-A782-0027128BFED2@zope.com>
	<46952349.5050606@v.loewis.de>
	<C91F9213-4ED8-4BF4-BBA0-4C06245BEC90@zope.com>
Message-ID: <46952E47.8020700@v.loewis.de>

> Can we agree that it is part of the purpose of PyPI to serve as a
> repository for setuptools?  I'd like to resolve this issue.  If it isn't
> part of PyPI's purpose to serve as a repository for setuptools, then
> we'll build another system that *does* have that purpose.  If it is part
> of the purpose to serve as a repository for setuptools, then we'll need
> to take various needs of setuptools into account.

I can't answer that question. I know PyPI is a master index of Python
software and other resources, because (as Benji York kindly reminded
me) that's the mission under which it was created.

Beyond that, it is what the community makes it to be. I personally know
it is not a "repository for setuptools" for *me*, as I don't use
setuptools. I also know it is a "repository for setuptools" for you,
as you have reported using it for that purpose. For many of the package
authors, I think it is a platform to advertise their software; for
some, it is also a web hosting service to place their released files
onto.

As for taking needs into account: First of all, it's a volunteer
project. Open source contributors are known to primarily scratch
their own itches. So if you want to see needs be taken into account,
you may have to write the code yourself, pay somebody to write
it for your, or talk somebody into writing it for you. In particular,
I personally won't write any line of code just because of a threat to
go away and write a competing index. Instead, my reaction to such
a threat remains the same: good luck!

Regards,
Martin

From benji at benjiyork.com  Wed Jul 11 21:27:45 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 15:27:45 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -   apache config.
In-Reply-To: <46952BE6.1070604@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>	<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
	<469468C5.8000906@v.loewis.de> <4694D028.6050203@benjiyork.com>
	<46952236.30704@v.loewis.de> <4695245C.3020703@benjiyork.com>
	<46952BE6.1070604@v.loewis.de>
Message-ID: <46952F31.5020806@benjiyork.com>

Martin v. L?wis wrote:
> That's true. That mission is "The Python Catalog SIG aims at producing a
> master index of Python software and other resources."
> 
> I think this still is the mission - be *the* central site for indexing
> Python software. The part "other resources" apparently never was
> considered; it only indexes software now.

There exists ambiguity as to the audience for the index.  Humans are 
assumed; I propose that packaging systems need to be on the list as well.

> I should have been more open: I don't
> see that I have time to implement the clashing check that Phillip
> proposed, although I'll see what I can do about the redirect
> on lookup.

Knowing your motivation helps.  I don't think anyone expected you to 
jump on the implementation.  It's OK to say that you don't have time to 
implement something.  There are other people that can help, and if not 
it'll just have to wait.  We have to make sure we distinguish between 
desirability and feasibility.
-- 
Benji York
http://benjiyork.com

From pje at telecommunity.com  Wed Jul 11 21:25:44 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 11 Jul 2007 15:25:44 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -      apache config.
In-Reply-To: <46952702.8060606@benjiyork.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
	<469467AA.7070409@v.loewis.de>
	<20070711184549.733CE3A404D@sparrow.telecommunity.com>
	<46952702.8060606@benjiyork.com>
Message-ID: <20070711192751.A9FF33A404D@sparrow.telecommunity.com>

At 02:52 PM 7/11/2007 -0400, Benji York wrote:
>Phillip J. Eby wrote:
>>This schism between the idea of neatly cataloging things, versus 
>>being able to actually *use* that cataloging for practical purposes 
>>by automated tools (as opposed to being usable only to human 
>>readers), seems to be at the heart of some of the current discussion.
>
>Wasn't there a proposal to merge the catalog-sig and distutils-sig?

Merging the lists isn't going to merge the people or change anybody's 
point of view.  The difference in SIGs reflects, for the most part, a 
difference in Special Interest -- the "I" in SIG.

Or another way of looking at the "I" is "Itch".  The people who have 
been working on cataloging already have their itch basically 
scratched; PyPI has been sufficient for their needs for some time now.

The packaging people, OTOH, have an ever-increasing itch, as 
setuptools hits its "hockey stick" growth phase both in user volume 
and package volume.  This is understandably, of little interest to 
people who don't do lots of packaging, deployment, and distribution.

I absolutely don't want to disparage the good folks who have made 
PyPI what it is today, and I totally understand their not wanting to 
take on the burden of supporting a tool they don't use or care about 
themselves, just because it happens to use PyPI.

But it seems to me that for folks whose Interest/Itch is not merely 
finding packages, but *using* them, a different infrastructure is 
needed, treating PyPI as the ultimate *source* of the information, 
without being also its sole *distribution* point, or query interface.

There are plenty of folks who have offered to spend funds, provide 
hosting, etc. for PyPI mirrors or alternatives -- perhaps we should 
create a SIG to start figuring out *how* to provide that, ideally 
while creating the least amount of additional service burden on the Cheeseshop.

Ideally, we could then support having the Cheeseshop redirect 
existing clients to a nearby distribution index, while newer clients 
could use a distribution index to start with.

Such a discussion would need to resolve certain design tradeoffs such 
as speed and availability vs. freshness of the index vs. load on the 
primary Cheeseshop vs. ability to have lots of mirrors/distribution 
indexes vs. ease of selecting one, etc.

But I believe the main reason why such discussion hasn't gone very 
far at this point is because the packaging-interest folks have been 
looking to the cataloging-interest folks to provide direction and 
focus to the discussion of the tradeoffs, even though these things 
lie mostly outside their itch/interest.  I think it is more likely to 
be productive for the packaging-interest folks to get clear about 
what they want first, and then the cataloging-interest folks can 
chime in if they see something being proposed that might be 
especially harmful to the Cheeseshop's availability or performance.


From jim at zope.com  Wed Jul 11 21:29:23 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 15:29:23 -0400
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To: <46952E47.8020700@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>
	<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
	<469468C5.8000906@v.loewis.de>
	<F4062140-DA6C-4F8C-A782-0027128BFED2@zope.com>
	<46952349.5050606@v.loewis.de>
	<C91F9213-4ED8-4BF4-BBA0-4C06245BEC90@zope.com>
	<46952E47.8020700@v.loewis.de>
Message-ID: <5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>


On Jul 11, 2007, at 3:23 PM, Martin v. L?wis wrote:
...
> As for taking needs into account: First of all, it's a volunteer
> project. Open source contributors are known to primarily scratch
> their own itches.

Thank you for explaining open source to me.

> So if you want to see needs be taken into account,
> you may have to write the code yourself, pay somebody to write
> it for your, or talk somebody into writing it for you.

Yup. I'm aware of that.

> In particular,
> I personally won't write any line of code just because of a threat to
> go away and write a competing index.

First, I'm not aware that anyone has asked you do do anything.

Second, I certainly meant no threat.  We need a working index to use  
with setuptools.  I would hope, in the spirit of open source to  
collaborate on that.  A basic questions that needs to be answered is  
whether to use PyPI or to build something else.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Wed Jul 11 21:41:59 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 21:41:59 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
	<4695241D.3090203@v.loewis.de>
	<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
Message-ID: <46953287.8020702@v.loewis.de>

>> This I still don't understand. Why does it need to query all available
>> releases?
> 
> The way that setuptools currently works, it scans each of the release
> pages looking for distributions.  In theory, it could take the names of
> these pages into account and scan fewer.  It will still have to scan at
> least 2.

Can you elaborate please? Why does it need to find distributions for
versions that it will eventually not download?

> I have a feeling that I'll never convince you that a third of a second
> is too slow. 

That's likely, yes.

> to get baking working, at which point the pages will be a lot faster. 
> At that point, I think it would be good to pursue alternate pages more
> optimized for setuptools to reduce the number and size of setuptools
> requests.   I'll help any way I can with that.

Deal: please provide sample pages for some of the packages (starting
with some zc packages perhaps), plus a directory structure in which
they should live.

I'll put them up on ximinez, at (say) /raw (or /simple, or
whatever URL people propose), so that one can experiment with
whether they look right.

Then somebody else can write a generator to populate that; I
will at the earliest point when I have time (which won't be
before August), unless somebody does it earlier.

Regards,
Martin

From martin at v.loewis.de  Wed Jul 11 21:53:04 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 21:53:04 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -  apache config.
In-Reply-To: <20070711190058.2322F3A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de>
	<CFA0A89E-0885-4084-8BD9-92868EF21B63@zope.com>
	<46946871.3060100@v.loewis.de> <4694C8B6.1030804@benjiyork.com>
	<46951B17.4000104@v.loewis.de> <469520B6.2030002@benjiyork.com>
	<20070711190058.2322F3A404D@sparrow.telecommunity.com>
Message-ID: <46953520.4080106@v.loewis.de>

> An interesting thought for future optimization...  an XML-RPC catalog
> server designed for this use case could in fact do all the computation
> server-side, resolving dependencies and evaluating version constraints. 
> Heck, in theory, it could cache packages' external links, and simply
> hand back to the caller a complete list of candidate URLs to choose for
> downloading.

You mean something like

select f.filename from release_files f,releases r where
f.name='setuptools' and f.name=r.name and f.version=r.version and not
r._pypi_hidden;

This gives

             filename
----------------------------------
 setuptools-0.6c5.win32-py2.3.exe
 setuptools-0.6c5-py2.3.egg
 setuptools-0.6c5.win32-py2.4.exe
 setuptools-0.6c5-1.src.rpm
 setuptools-0.6c5.win32-py2.5.exe
 setuptools-0.6c5.tar.gz
 setuptools-0.6c5-py2.5.egg
 setuptools-0.6c5-py2.4.egg


That would be very easy to add to the RPC server, and
would be quite efficient also.

> That way, most activities would take only one server
> round-trip to complete, if the client sent a list of everything it
> expects to need, and the server includes everything that the server
> expects the client to want due to those things' dependencies.
>
> The main obstacle to implementing such a service today, is that it would
> have no way of knowing what dependencies to look for, without sniffing
> the contents of .egg files.

For that, I would definitely need code contributions.

Regards,
Martin

From jim at zope.com  Wed Jul 11 21:57:47 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 15:57:47 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <46953287.8020702@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
	<4695241D.3090203@v.loewis.de>
	<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
	<46953287.8020702@v.loewis.de>
Message-ID: <EBF5DA56-EF5C-47C9-9AEB-8FD071C56404@zope.com>


On Jul 11, 2007, at 3:41 PM, Martin v. L?wis wrote:

>>> This I still don't understand. Why does it need to query all  
>>> available
>>> releases?
>>
>> The way that setuptools currently works, it scans each of the release
>> pages looking for distributions.  In theory, it could take the  
>> names of
>> these pages into account and scan fewer.  It will still have to  
>> scan at
>> least 2.
>
> Can you elaborate please? Why does it need to find distributions for
> versions that it will eventually not download?

It just scans the package page for URLs.  It doesn't really know that  
the release pages correspond to a particular version.

Let's suppose that setuptools was changed to be aware that PyPI  
release pages correspond to a particular version.  In that case, it  
would have to read the package page to discover the release pages and  
then it would have to read at least one release page.  If it had  
requirements other than the version (e.g. Python version or  
platform), it might have to scan several releases to find an  
acceptable distribution.  But, in the best case, it would have to  
scan at least two pages.

...

>> to get baking working, at which point the pages will be a lot faster.
>> At that point, I think it would be good to pursue alternate pages  
>> more
>> optimized for setuptools to reduce the number and size of setuptools
>> requests.   I'll help any way I can with that.
>
> Deal: please provide sample pages for some of the packages (starting
> with some zc packages perhaps), plus a directory structure in which
> they should live.

Fair enough.  I'll do that.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Wed Jul 11 22:08:33 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 22:08:33 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -      apache config.
In-Reply-To: <20070711192751.A9FF33A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
	<469467AA.7070409@v.loewis.de>
	<20070711184549.733CE3A404D@sparrow.telecommunity.com>
	<46952702.8060606@benjiyork.com>
	<20070711192751.A9FF33A404D@sparrow.telecommunity.com>
Message-ID: <469538C1.4050404@v.loewis.de>

> There are plenty of folks who have offered to spend funds, provide
> hosting, etc. for PyPI mirrors or alternatives -- perhaps we should
> create a SIG to start figuring out *how* to provide that, ideally while
> creating the least amount of additional service burden on the Cheeseshop.

This makes me suspicious. I can certainly believe that you may need more
sheer processing power, or more bandwidth, for such a system than the
current PyPI installation has to offer.

What I don't see why you need to implement something *different*. If
you need better queries - fine, add them to PyPI. If you need
replication, load balancing, etc, please add it to PyPI. If you
have a way faster machine, migrate PyPI to that machine. That
is all possible, but assumes availability of volunteers. However,
the approach "let's create a different system" *also* needs
volunteers. So I'd rather have these volunteers contribute to
a single system, instead of each of them building their own one.

With the particular offer of a faster machine, *all* it needs
is a volunteer who first migrates and then maintains the
installation. Of course, that would involve responsibility for
all of PyPI (i.e. also dealing with abandoned packages that
somebody else takes over, adding new classifiers, etc) (I
say that because that aspect also lacks volunteers in the
current installation).

Regards,
Martin


From martin at v.loewis.de  Wed Jul 11 22:11:09 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 22:11:09 +0200
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To: <5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>
	<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
	<469468C5.8000906@v.loewis.de>
	<F4062140-DA6C-4F8C-A782-0027128BFED2@zope.com>
	<46952349.5050606@v.loewis.de>
	<C91F9213-4ED8-4BF4-BBA0-4C06245BEC90@zope.com>
	<46952E47.8020700@v.loewis.de>
	<5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>
Message-ID: <4695395D.5030602@v.loewis.de>

> Second, I certainly meant no threat.  We need a working index to use
> with setuptools.  I would hope, in the spirit of open source to
> collaborate on that.  A basic questions that needs to be answered is
> whether to use PyPI or to build something else.

Ok. For this question, there is a seemingly-obvious answer: use PyPI.
Why on earth would somebody want to build something else?

Regards,
Martin

From martin at v.loewis.de  Wed Jul 11 22:15:44 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 22:15:44 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <EBF5DA56-EF5C-47C9-9AEB-8FD071C56404@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
	<4695241D.3090203@v.loewis.de>
	<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
	<46953287.8020702@v.loewis.de>
	<EBF5DA56-EF5C-47C9-9AEB-8FD071C56404@zope.com>
Message-ID: <46953A70.6070600@v.loewis.de>

> Let's suppose that setuptools was changed to be aware that PyPI release
> pages correspond to a particular version.  In that case, it would have
> to read the package page to discover the release pages and then it would
> have to read at least one release page.  If it had requirements other
> than the version (e.g. Python version or platform), it might have to
> scan several releases to find an acceptable distribution.  But, in the
> best case, it would have to scan at least two pages.

Sure. However, that makes the difference between O(1) and O(N),
where N is the number of releases recorded. Going back to your
original concern: you would not have to change the policy of
keeping many different releases if the number of releases
does not impact performance.

When it looks for individual release pages, does it know that these
are release pages, or does it follow all links on the package
page? If the latter, what links does it follow (there are plenty
more on the package page)?

Regards,
Martin

From jim at zope.com  Wed Jul 11 22:18:08 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 16:18:08 -0400
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To: <4695395D.5030602@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>
	<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>
	<469468C5.8000906@v.loewis.de>
	<F4062140-DA6C-4F8C-A782-0027128BFED2@zope.com>
	<46952349.5050606@v.loewis.de>
	<C91F9213-4ED8-4BF4-BBA0-4C06245BEC90@zope.com>
	<46952E47.8020700@v.loewis.de>
	<5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>
	<4695395D.5030602@v.loewis.de>
Message-ID: <86EDAB6A-62C4-437C-82CD-34242258472C@zope.com>


On Jul 11, 2007, at 4:11 PM, Martin v. L?wis wrote:

>> Second, I certainly meant no threat.  We need a working index to use
>> with setuptools.  I would hope, in the spirit of open source to
>> collaborate on that.  A basic questions that needs to be answered is
>> whether to use PyPI or to build something else.
>
> Ok. For this question, there is a seemingly-obvious answer: use PyPI.
> Why on earth would somebody want to build something else?

If we can make PyPI do what we (where "we" doesn't have to include  
"you") need, then there is no reason.

I don't want to shove a bunch of requirements down someone's throat.   
I understand that you don't object to new requirements if you don't  
have to be responsible for them. That's perfectly fair.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From benji at benjiyork.com  Wed Jul 11 22:22:09 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 11 Jul 2007 16:22:09 -0400
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To: <4695395D.5030602@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>	<069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com>	<469468C5.8000906@v.loewis.de>	<F4062140-DA6C-4F8C-A782-0027128BFED2@zope.com>	<46952349.5050606@v.loewis.de>	<C91F9213-4ED8-4BF4-BBA0-4C06245BEC90@zope.com>	<46952E47.8020700@v.loewis.de>	<5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>
	<4695395D.5030602@v.loewis.de>
Message-ID: <46953BF1.2020905@benjiyork.com>

Martin v. L?wis wrote:
>> Second, I certainly meant no threat.  We need a working index to use
>> with setuptools.  I would hope, in the spirit of open source to
>> collaborate on that.  A basic questions that needs to be answered is
>> whether to use PyPI or to build something else.
> 
> Ok. For this question, there is a seemingly-obvious answer: use PyPI.
> Why on earth would somebody want to build something else?

Great; now that we've established that PyPI's audience will include 
setuptools, the people who know what it wants can make (or reiterate) 
proposals.
-- 
Benji York
http://benjiyork.com

From jodok at lovelysystems.com  Wed Jul 11 23:15:43 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Wed, 11 Jul 2007 21:15:43 +0000 GMT
Subject: [Catalog-sig] The purpose(s) of PYPI
In-Reply-To: <5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de><069F2A59-78E3-4EE6-B3D9-22327A4ED25D@zope.com><469468C5.8000906@v.loewis.de><F4062140-DA6C-4F8C-A782-0027128BFED2@zope.com><46952349.5050606@v.loewis.de><C91F9213-4ED8-4BF4-BBA0-4C06245BEC90@zope.com><46952E47.8020700@v.loewis.de>
	<5713FCF2-E2B4-4599-A36B-3CF418A1CCDF@zope.com>
Message-ID: <1827602359-1184184972-cardhu_blackberry.rim.net-22952-@engine37-cell01.bwc.produk.on.blackberry>

+1 on all you said jim
--
Lovely Systems, Partner

phone: +43 5572 908060, fax: +43 5572 908060-77
Schmelzh?tterstra?e 26a, 6850 Dornbirn, Austria

-----Original Message-----
From: Jim Fulton <jim at zope.com>
Date: Wed, 11 Jul 2007 15:29:23 
To: "Martin v. L?wis"  <martin at v.loewis.de>
Cc:catalog-sig at python.org
Subject: Re: [Catalog-sig] The purpose(s) of PYPI


On Jul 11, 2007, at 3:23 PM, Martin v. L?wis wrote:
...
> As for taking needs into account: First of all, it's a volunteer
> project. Open source contributors are known to primarily scratch
> their own itches.

Thank you for explaining open source to me.

> So if you want to see needs be taken into account,
> you may have to write the code yourself, pay somebody to write
> it for your, or talk somebody into writing it for you.

Yup. I'm aware of that.

> In particular,
> I personally won't write any line of code just because of a threat to
> go away and write a competing index.

First, I'm not aware that anyone has asked you do do anything.

Second, I certainly meant no threat.  We need a working index to use  
with setuptools.  I would hope, in the spirit of open source to  
collaborate on that.  A basic questions that needs to be answered is  
whether to use PyPI or to build something else.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org



_______________________________________________
Catalog-SIG mailing list
Catalog-SIG at python.org
http://mail.python.org/mailman/listinfo/catalog-sig

From jim at zope.com  Wed Jul 11 22:29:56 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 16:29:56 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <46953A70.6070600@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
	<4695241D.3090203@v.loewis.de>
	<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
	<46953287.8020702@v.loewis.de>
	<EBF5DA56-EF5C-47C9-9AEB-8FD071C56404@zope.com>
	<46953A70.6070600@v.loewis.de>
Message-ID: <ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>


On Jul 11, 2007, at 4:15 PM, Martin v. L?wis wrote:

>> Let's suppose that setuptools was changed to be aware that PyPI  
>> release
>> pages correspond to a particular version.  In that case, it would  
>> have
>> to read the package page to discover the release pages and then it  
>> would
>> have to read at least one release page.  If it had requirements other
>> than the version (e.g. Python version or platform), it might have to
>> scan several releases to find an acceptable distribution.  But, in  
>> the
>> best case, it would have to scan at least two pages.
>
> Sure. However, that makes the difference between O(1) and O(N),
> where N is the number of releases recorded. Going back to your
> original concern: you would not have to change the policy of
> keeping many different releases if the number of releases
> does not impact performance.

Yup.  Absolutely.  That's why it we should change the index or  
setuptools, or both.  IMO, it makes the most sense to change the  
index to have setuptools specific pages, in addition to the ones for  
humans, that allow:

- One page per package and

- a minimal amount of data to be downloaded and scanned per page.

   (As I noted before, release pages are meant for humans.  They  
sometimes contain *lots* of data that setuptools doesn't need.)

> When it looks for individual release pages, does it know that these
> are release pages, or does it follow all links on the package
> page?

I'll have to dig to answer that question precisely.  I'll do that  
after pausing to see if Phillip explains it first.

> If the latter, what links does it follow (there are plenty
> more on the package page)?

See: http://mail.python.org/pipermail/catalog-sig/2007-July/001217.html

It seems to only scan the release pages.  So it has some heuristic to  
know which links to follow.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Wed Jul 11 22:43:41 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 11 Jul 2007 22:43:41 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
	<4695241D.3090203@v.loewis.de>
	<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
	<46953287.8020702@v.loewis.de>
	<EBF5DA56-EF5C-47C9-9AEB-8FD071C56404@zope.com>
	<46953A70.6070600@v.loewis.de>
	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
Message-ID: <469540FD.5060109@v.loewis.de>

>> If the latter, what links does it follow (there are plenty
>> more on the package page)?
> 
> See: http://mail.python.org/pipermail/catalog-sig/2007-July/001217.html
> 
> It seems to only scan the release pages.  So it has some heuristic to
> know which links to follow.

Looking at

http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api

tells me that it always expects that release pages have the form
base/projectname/version.

This looks like a formal specification of PyPI, so I wonder why it
then would not trust this specification more actively.

Regards,
Martin

From jim at zope.com  Wed Jul 11 22:55:55 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 16:55:55 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <469540FD.5060109@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
	<4695241D.3090203@v.loewis.de>
	<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
	<46953287.8020702@v.loewis.de>
	<EBF5DA56-EF5C-47C9-9AEB-8FD071C56404@zope.com>
	<46953A70.6070600@v.loewis.de>
	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
	<469540FD.5060109@v.loewis.de>
Message-ID: <05004547-983F-4192-8FA6-7D0A05D6155C@zope.com>


On Jul 11, 2007, at 4:43 PM, Martin v. L?wis wrote:

>>> If the latter, what links does it follow (there are plenty
>>> more on the package page)?
>>
>> See: http://mail.python.org/pipermail/catalog-sig/2007-July/ 
>> 001217.html
>>
>> It seems to only scan the release pages.  So it has some heuristic to
>> know which links to follow.
>
> Looking at
>
> http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api
>
> tells me that it always expects that release pages have the form
> base/projectname/version.
>
> This looks like a formal specification of PyPI, so I wonder why it
> then would not trust this specification more actively.

<shrug>  Phillip has certainly said it could.

IMO, it wouldn't really matter if the pages used by setuptools were  
specialized for it. Compared with changing setuptools to be more  
clever in its handling of release pages, providing custom pages for  
setuptools will reduce the number of requests by at least 50% and  
sometimes much more and will greatly reduce the amount of data that  
needs to be downloaded and scanned.  Someone will need to modify some  
software in either case, so the custom index pages look like a big  
win to me.

I'll take a stab at writing a module, probably using setuptools  
itself, to scan the existing package and release pages to generate  
the sort of pages I'm talking about.  This can be used to generate  
sample pages and might be useful for implementing the pages.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From richardjones at optusnet.com.au  Thu Jul 12 00:09:48 2007
From: richardjones at optusnet.com.au (Richard Jones)
Date: Thu, 12 Jul 2007 08:09:48 +1000
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<46953A70.6070600@v.loewis.de>
	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
Message-ID: <200707120809.48344.richardjones@optusnet.com.au>

On Thu, 12 Jul 2007, you wrote:
> Yup.  Absolutely.  That's why it we should change the index or
> setuptools, or both.  IMO, it makes the most sense to change the
> index to have setuptools specific pages, in addition to the ones for
> humans, that allow:

... you know about the XML-RPC interface, yes?

http://wiki.python.org/moin/CheeseShopXmlRpc

I never fully understood why setuptools went with HTML scraping instead of 
XML-RPC.


     Richard

From richardjones at optushome.com.au  Thu Jul 12 00:11:49 2007
From: richardjones at optushome.com.au (Richard Jones)
Date: Thu, 12 Jul 2007 08:11:49 +1000
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <469522D6.1070706@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
	<469522D6.1070706@v.loewis.de>
Message-ID: <200707120811.49824.richardjones@optushome.com.au>

On Thu, 12 Jul 2007, Martin v. L?wis wrote:
> > The questions for us is, how much effort we are willing to make to
> > prevent people from shooting themselves in the foot.  I can understand
> > why Phillip would like the package index to prevent people from choosing
> > problematic package names.
>
> That's not my understanding - the issue isn't with "problematic package
> names", but with conflicting package names. IOW, any single name is
> fine - it's a pair of names that would cause a problem (and only if
> you wanted to install both packages on the same system).

A big issue that's not been raised is that *distutils* have no package name 
rules, but it's being proposed that PyPI does - thus a package author will 
potentially get an error when uploading their package, and also the name that 
appears in the index may be quite different to the name of their package.


    Richard

From richardjones at optushome.com.au  Thu Jul 12 00:23:11 2007
From: richardjones at optushome.com.au (Richard Jones)
Date: Thu, 12 Jul 2007 08:23:11 +1000
Subject: [Catalog-sig] start on static generation,
	and caching -  apache config.
In-Reply-To: <20070711190058.2322F3A404D@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<469520B6.2030002@benjiyork.com>
	<20070711190058.2322F3A404D@sparrow.telecommunity.com>
Message-ID: <200707120823.12001.richardjones@optushome.com.au>

On Thu, 12 Jul 2007, Phillip J. Eby wrote:
> An interesting thought for future optimization...  an XML-RPC catalog
> server designed for this use case could in fact do all the
> computation server-side, resolving dependencies and evaluating
> version constraints.

Just to remind again: PyPI has an XML-RPC interface, and has had for a long 
time. It has a history of accepting any and all additional functions for that 
interface.


    Richard

ps. why is it I keep on reading this undercurrent of "pypi doesn't do exactly 
what we need, so let's write a new one" and not "let's just add some more 
functionality to pypi so it does exactly what we need"... Is there something 
written somewhere, or even implied, that PyPI is somehow a closed 
development? If there is, I really need to strongly reiterate - PyPI will 
*always* be completely open for new developers. Please see the wiki page 
http://wiki.python.org/moin/CheeseShopDev for further information.

From pje at telecommunity.com  Thu Jul 12 00:40:26 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 11 Jul 2007 18:40:26 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -     apache config.
In-Reply-To: <200707120809.48344.richardjones@optusnet.com.au>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<46953A70.6070600@v.loewis.de>
	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
	<200707120809.48344.richardjones@optusnet.com.au>
Message-ID: <20070711223812.D02D13A404D@sparrow.telecommunity.com>

At 08:09 AM 7/12/2007 +1000, Richard Jones wrote:
>On Thu, 12 Jul 2007, you wrote:
> > Yup.  Absolutely.  That's why it we should change the index or
> > setuptools, or both.  IMO, it makes the most sense to change the
> > index to have setuptools specific pages, in addition to the ones for
> > humans, that allow:
>
>... you know about the XML-RPC interface, yes?
>
>http://wiki.python.org/moin/CheeseShopXmlRpc
>
>I never fully understood why setuptools went with HTML scraping instead of
>XML-RPC.

Fundamentally, it was because the XML-RPC API did not then (and does 
not now) provide everything that's needed.  (As I mentioned a few of 
the other times you asked this.)  The API has improved and added some 
of the missing bits, but not all of them.

There are two pieces still missing:

1. Access to "hidden" packages' release info

2. Links in the long_description that are rendered by PyPI's web interface

Without #2, we can't pick up author-provided Subversion links; see:

   http://peak.telecommunity.com/DevCenter/setuptools#making-your-package-available-for-easyinstall

for details.

With this information, easy_install could be changed to use the 
XML-RPC API....  *but* it would make even *more* round-trips to PyPI 
than it does now, unless those APIs were also designed differently 
than the ones that exist now, because you would need at least one 
search to find the correct package and its PKG-INFO, and another 
search to get the download files.  Currently, it can at least get 
both of these in one trip, if the package name is an exact match.

To answer Martin's question of why setuptools doesn't "trust" the 
PyPI specification even more, it's because having chosen to use the 
web interface to get the information, I thought it prudent to use 
only that subset of the web interface that could be easily duplicated 
using simple Apache directory indexes, since that meant someone could 
create their own index or mirror a portion of PyPI without having to 
implement its entire feature set.  This later proved prudent when Jim 
wanted to have tests of his buildout framework that did not rely on 
PyPI being up, as it made it easier to create a mock PyPI for unit 
testing purposes.

To be honest, the one thing I did *not* anticipate in this design was 
that Jim would be making 20 releases of the same package available in 
"unhidden" form.  :)


From pje at telecommunity.com  Thu Jul 12 00:44:51 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 11 Jul 2007 18:44:51 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -     apache config.
In-Reply-To: <200707120811.49824.richardjones@optushome.com.au>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
	<469522D6.1070706@v.loewis.de>
	<200707120811.49824.richardjones@optushome.com.au>
Message-ID: <20070711224237.511A73A404D@sparrow.telecommunity.com>

At 08:11 AM 7/12/2007 +1000, Richard Jones wrote:
>On Thu, 12 Jul 2007, Martin v. L?wis wrote:
> > > The questions for us is, how much effort we are willing to make to
> > > prevent people from shooting themselves in the foot.  I can understand
> > > why Phillip would like the package index to prevent people from choosing
> > > problematic package names.
> >
> > That's not my understanding - the issue isn't with "problematic package
> > names", but with conflicting package names. IOW, any single name is
> > fine - it's a pair of names that would cause a problem (and only if
> > you wanted to install both packages on the same system).
>
>A big issue that's not been raised is that *distutils* have no package name
>rules, but it's being proposed that PyPI does - thus a package author will
>potentially get an error when uploading their package,

That would happen now, if they spell their package exactly the same 
as somebody else's package.


>and also the name that
>appears in the index may be quite different to the name of their package.

No-one has proposed that PyPI *change* a package's name, only that 
one not be allowed to *add* a package whose name does not 
sufficiently differ from an existing package that it would have a 
different filename.

In other words, since someone has uploaded a package to the 
CheeseShop called "aspects", I should not be able to register a 
package called "Aspects" or "asPecTS".

If on the other hand I had registered a package named "Aspects" 
first, then the other person should not be able to create one called 
"aspects" or "ASPects".

So there is neither any changing of names, nor rejection of names on 
their own, but only a restriction as to how *similar* two names may be.


From martin at v.loewis.de  Thu Jul 12 00:51:55 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 12 Jul 2007 00:51:55 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -     apache config.
In-Reply-To: <20070711223812.D02D13A404D@sparrow.telecommunity.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<46953A70.6070600@v.loewis.de>
	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
	<200707120809.48344.richardjones@optusnet.com.au>
	<20070711223812.D02D13A404D@sparrow.telecommunity.com>
Message-ID: <46955F0B.2060006@v.loewis.de>

> 1. Access to "hidden" packages' release info

Can you explain what you need them for, and when?

I don't fully understand _pypi_hidden, however, I thought that
a "hidden" release is really one that the author doesn't want
to be ever found, and that is maintained just because of old
clients know exactly where it is, and access it directly.

> 2. Links in the long_description that are rendered by PyPI's web interface

Just specify precisely what operation you want, and what precisely the
result should be, and it will appear (also for _pypi_hidden).

> With this information, easy_install could be changed to use the XML-RPC
> API....  *but* it would make even *more* round-trips to PyPI than it
> does now, unless those APIs were also designed differently than the ones
> that exist now, because you would need at least one search to find the
> correct package and its PKG-INFO, and another search to get the download
> files.  Currently, it can at least get both of these in one trip, if the
> package name is an exact match.

Ok, so can you design different APIs, reducing the number of roundtrips
to one in the common case, while simultaneously not requiring the server
to compute information that is not needed in the common case?

If you can, it will appear.

> To answer Martin's question of why setuptools doesn't "trust" the PyPI
> specification even more, it's because having chosen to use the web
> interface to get the information, I thought it prudent to use only that
> subset of the web interface that could be easily duplicated using simple
> Apache directory indexes, since that meant someone could create their
> own index or mirror a portion of PyPI without having to implement its
> entire feature set.  This later proved prudent when Jim wanted to have
> tests of his buildout framework that did not rely on PyPI being up, as
> it made it easier to create a mock PyPI for unit testing purposes.

I still don't understand. I'm talking about not accessing all
versions in /root/package/version, trusting that the last part
really is a version (i.e. reading only /root/package, finding
out all possible versions, selecting the best one, then reading
/root/package/bestversion).

I cannot see why this is unavailable in a straight directory
indexes. Correct me if I'm wrong, but I think you can have

/root/package/index.html
/root/package/version/index.html

and then still chose to make both index.html the same
(if there is only a single version), or list the individual
versions in the top-level index.html.

Or, you can just drop /root/package/index.html, trusting
that the Apache directory index will list the single
version subdirectory, anyway.

Regards,
Martin


From jim at zope.com  Thu Jul 12 00:51:51 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 18:51:51 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <200707120809.48344.richardjones@optusnet.com.au>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<46953A70.6070600@v.loewis.de>
	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
	<200707120809.48344.richardjones@optusnet.com.au>
Message-ID: <297846B8-94DC-4770-9476-711796E82FEC@zope.com>


On Jul 11, 2007, at 6:09 PM, Richard Jones wrote:

> On Thu, 12 Jul 2007, you wrote:
>> Yup.  Absolutely.  That's why it we should change the index or
>> setuptools, or both.  IMO, it makes the most sense to change the
>> index to have setuptools specific pages, in addition to the ones for
>> humans, that allow:
>
> ... you know about the XML-RPC interface, yes?

Yes.

>
> http://wiki.python.org/moin/CheeseShopXmlRpc
>
> I never fully understood why setuptools went with HTML scraping  
> instead of
> XML-RPC.

The main reason, as Phillip has explained is that he wants to allow  
static mirrors of the index.  Another good reason is to allow static  
implementation, which would be far more scalable in the long run.

Thanks for reminding me of this though as it will make my little  
project to prototype an alternate index format for setuptools easier. :)

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Thu Jul 12 00:56:01 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 18:56:01 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -  apache config.
In-Reply-To: <200707120823.12001.richardjones@optushome.com.au>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<469520B6.2030002@benjiyork.com>
	<20070711190058.2322F3A404D@sparrow.telecommunity.com>
	<200707120823.12001.richardjones@optushome.com.au>
Message-ID: <4198A946-0B11-4F19-9D99-CD7F7B4B9161@zope.com>


On Jul 11, 2007, at 6:23 PM, Richard Jones wrote:
...
> ps. why is it I keep on reading this undercurrent of "pypi doesn't  
> do exactly
> what we need, so let's write a new one" and not "let's just add  
> some more
> functionality to pypi so it does exactly what we need"... Is there  
> something
> written somewhere, or even implied, that PyPI is somehow a closed
> development? If there is, I really need to strongly reiterate -  
> PyPI will
> *always* be completely open for new developers. Please see the wiki  
> page
> http://wiki.python.org/moin/CheeseShopDev for further information.

I don't think anyone wants to write an alternative.  Well, maybe  
there are people like that, but you aren't reading them here.  Why  
would people spend time arguing about requirements, performance, etc,  
if they wanted to write their own.

Some people are being forced to implement their own indexes because  
they've become dependent on PyPI and PyPI just hasn't been there for  
them lately.  I'm pretty sure they don't want to maintain alternate  
indexes in the long term.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jeremy.kloth at 4suite.org  Thu Jul 12 01:20:49 2007
From: jeremy.kloth at 4suite.org (Jeremy Kloth)
Date: Wed, 11 Jul 2007 17:20:49 -0600
Subject: [Catalog-sig] start on static generation,
	and caching -     apache config.
In-Reply-To: <20070711223812.D02D13A404D@sparrow.telecommunity.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<200707120809.48344.richardjones@optusnet.com.au>
	<20070711223812.D02D13A404D@sparrow.telecommunity.com>
Message-ID: <200707111720.49299.jeremy.kloth@4suite.org>

On Wednesday 11 July 2007 4:40:26 pm Phillip J. Eby wrote:
> 1. Access to "hidden" packages' release info

This already exists. Simply call release_data() with the exact version you are 
interested in. It returns the metadata regardless of the "hidden" flag.

> 2. Links in the long_description that are rendered by PyPI's web interface

The 'description' key in the dictionary returned by release_data() contains 
the long_description as provided by the package's setup.py. I would think 
that scanning just that should be simpler than relying on particular 
formatting of the PyPI generated package page.

--
Jeremy Kloth
http://4suite.org/

From jim at zope.com  Thu Jul 12 01:32:21 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 19:32:21 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <20070711223812.D02D13A404D@sparrow.telecommunity.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<46953A70.6070600@v.loewis.de>
	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
	<200707120809.48344.richardjones@optusnet.com.au>
	<20070711223812.D02D13A404D@sparrow.telecommunity.com>
Message-ID: <484AE499-EB19-4831-9AFB-1BCC3FCE9249@zope.com>


On Jul 11, 2007, at 6:40 PM, Phillip J. Eby wrote:
...
> There are two pieces still missing:
>
> 1. Access to "hidden" packages' release info

I'm not sure what you are referring to here. Are you talking about  
hidden releases? Or something else?


> 2. Links in the long_description that are rendered by PyPI's web  
> interface
>
> Without #2, we can't pick up author-provided Subversion links; see:
>
>   http://peak.telecommunity.com/DevCenter/setuptools#making-your- 
> package-available-for-easyinstall
> for details.


AFAICT, the information is available in the output of the  
release_data method.

...

> To be honest, the one thing I did *not* anticipate in this design  
> was that Jim would be making 20 releases of the same package  
> available in "unhidden" form.  :)

I assume you understand why this is needed. (Or maybe it isn't needed  
and I'm missing something.)  We need to be able to depend on old  
versions and AFAICT, setuptools can't see hidden releases.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Thu Jul 12 01:46:45 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 19:46:45 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <200707120811.49824.richardjones@optushome.com.au>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
	<469522D6.1070706@v.loewis.de>
	<200707120811.49824.richardjones@optushome.com.au>
Message-ID: <A0D02EC0-8AA8-478D-8071-782516ED4651@zope.com>


On Jul 11, 2007, at 6:11 PM, Richard Jones wrote:

> On Thu, 12 Jul 2007, Martin v. L?wis wrote:
>>> The questions for us is, how much effort we are willing to make to
>>> prevent people from shooting themselves in the foot.  I can  
>>> understand
>>> why Phillip would like the package index to prevent people from  
>>> choosing
>>> problematic package names.
>>
>> That's not my understanding - the issue isn't with "problematic  
>> package
>> names", but with conflicting package names. IOW, any single name is
>> fine - it's a pair of names that would cause a problem (and only if
>> you wanted to install both packages on the same system).
>
> A big issue that's not been raised is that *distutils* have no  
> package name
> rules, but it's being proposed that PyPI does - thus a package  
> author will
> potentially get an error when uploading their package, and also the  
> name that
> appears in the index may be quite different to the name of their  
> package.

Maybe distutils should have more package name rules than it does  
now.  We (the Community) should be free to change things based on  
experience.  We now have a lot more experience with this stuff than  
we had a few years ago.  Maybe we should consider a reset.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Thu Jul 12 01:47:59 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 11 Jul 2007 19:47:59 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <297846B8-94DC-4770-9476-711796E82FEC@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<46953A70.6070600@v.loewis.de>
	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
	<200707120809.48344.richardjones@optusnet.com.au>
	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>
Message-ID: <ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>


On Jul 11, 2007, at 6:51 PM, Jim Fulton wrote:
> Another good reason is to allow static
> implementation, which would be far more scalable in the long run.

ATM, from my machine, xml-rpc requests to PyPI are taking about .27  
seconds.  This is only a little less than regular page requests.   
With the current API, It would require at best 3 requests to get all  
of the distribution URLs.  Presumably, with a change to the API, we  
could get this down to one request, but that's still a long time  
given the demand I expect on PyPI in the future.

It would be so much simpler to just publish a static page for each  
package that setuptools could parse.  I'll try to prototype this.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From waterbug at pangalactic.us  Thu Jul 12 03:17:00 2007
From: waterbug at pangalactic.us (Stephen Waterbury)
Date: Wed, 11 Jul 2007 21:17:00 -0400
Subject: [Catalog-sig] No more cc's please (was Re: start on static
 generation, and caching -    apache config.)
In-Reply-To: <05004547-983F-4192-8FA6-7D0A05D6155C@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>
	<4695241D.3090203@v.loewis.de>
	<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>
	<46953287.8020702@v.loewis.de>
	<EBF5DA56-EF5C-47C9-9AEB-8FD071C56404@zope.com>
	<46953A70.6070600@v.loewis.de>
	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
	<469540FD.5060109@v.loewis.de>
	<05004547-983F-4192-8FA6-7D0A05D6155C@zope.com>
Message-ID: <4695810C.7070606@pangalactic.us>

Everyone:

Please exclude me from the cc's of all messages you send to the list!
I'm a *member* of the catalog-sig list, so I'm getting 2 copies of every
message in this thread and it's getting annoying.  I'm against all this
cc crap anyway -- that's why we have a *list*, dammit!  (Geez, one
would think Python programmers would be more email literate!  grumble.)

Thanks,
Steve

From martin at v.loewis.de  Thu Jul 12 07:11:50 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 12 Jul 2007 07:11:50 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<46953A70.6070600@v.loewis.de>
	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
	<200707120809.48344.richardjones@optusnet.com.au>
	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>
	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>
Message-ID: <4695B816.9020706@v.loewis.de>

> ATM, from my machine, xml-rpc requests to PyPI are taking about .27
> seconds.  This is only a little less than regular page requests.  With
> the current API, It would require at best 3 requests to get all of the
> distribution URLs.  Presumably, with a change to the API, we could get
> this down to one request, but that's still a long time given the demand
> I expect on PyPI in the future.

You seem to assume that if you see a round trip time of .27 seconds,
that then PyPI could only do 3 requests per second. That is not so.

I just logged onto www.python.org (a machine that is close to
cheeseshop.python.org), and called this function:

>>> s=xmlrpclib.ServerProxy("http://cheeseshop.python.org/pypi")
>>> def f():
...   start=time.time()
...   for i in range(1000):s.package_releases('setuptools')
...   return time.time()-start
...
>>> f()
7.6247878074645996

So it can currently do 130 XML-RPC requests per second, to
a single client. Inverting it, a request takes 0.0076s,
which is a lot less than 0.27s.

Regards,
Martin

From pje at telecommunity.com  Thu Jul 12 07:48:38 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 12 Jul 2007 01:48:38 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -      apache config.
In-Reply-To: <200707111720.49299.jeremy.kloth@4suite.org>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<200707120809.48344.richardjones@optusnet.com.au>
	<20070711223812.D02D13A404D@sparrow.telecommunity.com>
	<200707111720.49299.jeremy.kloth@4suite.org>
Message-ID: <20070712054627.886D13A404D@sparrow.telecommunity.com>

At 05:20 PM 7/11/2007 -0600, Jeremy Kloth wrote:
>On Wednesday 11 July 2007 4:40:26 pm Phillip J. Eby wrote:
> > 1. Access to "hidden" packages' release info
>
>This already exists. Simply call release_data() with the exact 
>version you are
>interested in. It returns the metadata regardless of the "hidden" flag.

There is no way to discover those versions, however, AFAICT


> > 2. Links in the long_description that are rendered by PyPI's web interface
>
>The 'description' key in the dictionary returned by release_data() contains
>the long_description as provided by the package's setup.py. I would think
>that scanning just that should be simpler than relying on particular
>formatting of the PyPI generated package page.

Alas, this entire subject area is one where lots of people "would 
think" that such-and-such a thing would be simpler, but 
isn't.  :(  In this case, long_description is allowed to be 
reStructured Text, which nothing less than a full reST parser can 
handle.  It's much easier to scan for a simple regular expression 
pattern to pull the links out of HTML, than to handle all the ways 
URLs can be spelled in reST, AFAICT.

That having been said, I've never actually made the attempt, for 
simple historical reasons.  I'll happily review patches for the 
functionality, as long as they can gracefully fall back to 
non-XML-RPC use, or provide an option to disable it so people using 
their own static indexes can still function.


From renesd at gmail.com  Thu Jul 12 08:01:00 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Thu, 12 Jul 2007 16:01:00 +1000
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <200707120809.48344.richardjones@optusnet.com.au>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<46953A70.6070600@v.loewis.de>
	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
	<200707120809.48344.richardjones@optusnet.com.au>
Message-ID: <64ddb72c0707112301p51614078sa84a3135584b11e8@mail.gmail.com>

xmlrpc uses POST.  So it's terrible for performance, and semantically
impossible to cache.


On 7/12/07, Richard Jones <richardjones at optusnet.com.au> wrote:
> On Thu, 12 Jul 2007, you wrote:
> > Yup.  Absolutely.  That's why it we should change the index or
> > setuptools, or both.  IMO, it makes the most sense to change the
> > index to have setuptools specific pages, in addition to the ones for
> > humans, that allow:
>
> ... you know about the XML-RPC interface, yes?
>
> http://wiki.python.org/moin/CheeseShopXmlRpc
>
> I never fully understood why setuptools went with HTML scraping instead of
> XML-RPC.
>
>
>      Richard
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>

From renesd at gmail.com  Thu Jul 12 08:15:23 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Thu, 12 Jul 2007 16:15:23 +1000
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <64ddb72c0707112301p51614078sa84a3135584b11e8@mail.gmail.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<46953A70.6070600@v.loewis.de>
	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
	<200707120809.48344.richardjones@optusnet.com.au>
	<64ddb72c0707112301p51614078sa84a3135584b11e8@mail.gmail.com>
Message-ID: <64ddb72c0707112315q6f34439en79b437ad1e9c4d6e@mail.gmail.com>

hellos,

ok, maybe I'm wrong about the performance of this interface!

I guess I meant in general - using POST for GET requests is not such a
nice thing.

cu.

On 7/12/07, Ren? Dudfield <renesd at gmail.com> wrote:
> xmlrpc uses POST.  So it's terrible for performance, and semantically
> impossible to cache.
>
>
> On 7/12/07, Richard Jones <richardjones at optusnet.com.au> wrote:
> > On Thu, 12 Jul 2007, you wrote:
> > > Yup.  Absolutely.  That's why it we should change the index or
> > > setuptools, or both.  IMO, it makes the most sense to change the
> > > index to have setuptools specific pages, in addition to the ones for
> > > humans, that allow:
> >
> > ... you know about the XML-RPC interface, yes?
> >
> > http://wiki.python.org/moin/CheeseShopXmlRpc
> >
> > I never fully understood why setuptools went with HTML scraping instead of
> > XML-RPC.
> >
> >
> >      Richard
> > _______________________________________________
> > Catalog-SIG mailing list
> > Catalog-SIG at python.org
> > http://mail.python.org/mailman/listinfo/catalog-sig
> >
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>

From jim at zope.com  Thu Jul 12 12:34:19 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 12 Jul 2007 06:34:19 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <4695B816.9020706@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<46953A70.6070600@v.loewis.de>
	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
	<200707120809.48344.richardjones@optusnet.com.au>
	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>
	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>
	<4695B816.9020706@v.loewis.de>
Message-ID: <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>


On Jul 12, 2007, at 1:11 AM, Martin v. L?wis wrote:

>> ATM, from my machine, xml-rpc requests to PyPI are taking about .27
>> seconds.  This is only a little less than regular page requests.   
>> With
>> the current API, It would require at best 3 requests to get all of  
>> the
>> distribution URLs.  Presumably, with a change to the API, we could  
>> get
>> this down to one request, but that's still a long time given the  
>> demand
>> I expect on PyPI in the future.
>
> You seem to assume that if you see a round trip time of .27 seconds,
> that then PyPI could only do 3 requests per second. That is not so.

Yeah, it occurred to me on my way home that a substantial part of the  
time might be due to distance.

I wonder what times ab against  http://www.python.org/pypi/ZODB3 from  
inside the python.org network would give.
I wonder if it would help much to make multiple HTTP requests in the  
same connection.  This might be something to look at in setuptools  
and/or xmlrpclib.

....

> So it can currently do 130 XML-RPC requests per second, to
> a single client. Inverting it, a request takes 0.0076s,
> which is a lot less than 0.27s.

Cool. That's much better. Thanks for trying this.

OTOH, this points up a couple things:

1. Since many people will be far away from PyPI, I think our long- 
term plan should encompass geographic mirrors.  It's good that the  
server is spending a small amount of time, but it still takes *me* a  
long time to get data.

2. It's important to reduce the number of round trips.

I'm still opposed to using XML-RPC because:

- It's harder to mirror, and

- It's still slower than static pages.

Note that after our discussion, I'm equally against the current  
approach of parsing a human interface.  I still think it makes a lot  
more sense to have a tailored interface for setuptools.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From benji at benjiyork.com  Thu Jul 12 13:26:44 2007
From: benji at benjiyork.com (Benji York)
Date: Thu, 12 Jul 2007 07:26:44 -0400
Subject: [Catalog-sig] No more cc's please (was Re: start on static
 generation, and caching -    apache config.)
In-Reply-To: <4695810C.7070606@pangalactic.us>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<721297D4-85EA-4397-84C9-D90E5598477A@zope.com>	<4695241D.3090203@v.loewis.de>	<0AE45281-6CDB-4277-9017-098AC235CCAE@zope.com>	<46953287.8020702@v.loewis.de>	<EBF5DA56-EF5C-47C9-9AEB-8FD071C56404@zope.com>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<469540FD.5060109@v.loewis.de>	<05004547-983F-4192-8FA6-7D0A05D6155C@zope.com>
	<4695810C.7070606@pangalactic.us>
Message-ID: <46960FF4.3050609@benjiyork.com>

Stephen Waterbury wrote:
> Please exclude me from the cc's of all messages you send to the list!
> I'm a *member* of the catalog-sig list, so I'm getting 2 copies of every
> message in this thread and it's getting annoying.  I'm against all this
> cc crap anyway -- that's why we have a *list*, dammit!  (Geez, one
> would think Python programmers would be more email literate!  grumble.)

Go to http://mail.python.org/mailman/options/catalog-sig and set the 
"Avoid duplicate copies of messages?" option to "Yes".  (One would think 
a list member would be more mailman literate!)
-- 
Benji York
http://benjiyork.com

From amk at amk.ca  Thu Jul 12 14:20:30 2007
From: amk at amk.ca (A.M. Kuchling)
Date: Thu, 12 Jul 2007 08:20:30 -0400
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <46951B55.9050009@v.loewis.de>
References: <A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
	<46946A69.4000702@v.loewis.de>
	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>
	<46951612.9010009@v.loewis.de>
	<bf7b44d50707111047u7e2ae963t9395e769ba827d@mail.gmail.com>
	<46951B55.9050009@v.loewis.de>
Message-ID: <20070712122030.GA5853@amk-desktop.matrixgroup.net>

On Wed, Jul 11, 2007 at 08:03:01PM +0200, "Martin v. L?wis" wrote:
> > IIRC it was a 503 or 502 -- if I had to guess, it appeared that Apache
> > is passing requests through to a local process (mod_rewrite or
> > mod_proxy?), and that process wasn't responding.
> 
> Neither is going on for PyPI, AFAIK - it's mod_fastcgi.

www.python.org/pypi does use mod_proxy to provide PyPI access from the
old URL; it's possible these users were going through www.python.org.

--amk


From gentoodev at gmail.com  Thu Jul 12 16:39:14 2007
From: gentoodev at gmail.com (Rob Cakebread)
Date: Thu, 12 Jul 2007 07:39:14 -0700
Subject: [Catalog-sig] start on static generation,
	and caching - apache config.
In-Reply-To: <9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>
	<4693FE94.6090107@v.loewis.de> <469446A2.9070500@pangalactic.us>
	<46946A69.4000702@v.loewis.de>
	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>
	<9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com>
Message-ID: <9b06ffb10707120739s56ef8736mce1545071df3475b@mail.gmail.com>

On 7/11/07, Fred Drake <fdrake at gmail.com> wrote:
> On 7/11/07, Nathan R. Yergler <nathan at creativecommons.org> wrote:
> > The speed has noticeably improved (thanks!) but as recently as Monday
> > PyPI was unresponsive and then returning proxy errors.  It definitely
> > caused us (Creative Commons) to lose productivity Monday afternoon
> > (PDT).
>
> We're seeing this right now, too.  I'm checking both www.python.org
> and cheeseshop.python.org.
>
>


As of 7:30am PST it's timing out on the website and via XML-RPC,
testing from L.A. or Germany.

From pje at telecommunity.com  Thu Jul 12 20:07:52 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 12 Jul 2007 14:07:52 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -     apache config.
In-Reply-To: <469522D6.1070706@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
	<469467AA.7070409@v.loewis.de>
	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
	<469522D6.1070706@v.loewis.de>
Message-ID: <20070712180539.3BFB43A40D7@sparrow.telecommunity.com>

At 08:35 PM 7/11/2007 +0200, Martin v. L?wis wrote:
> > The questions for us is, how much effort we are willing to make to
> > prevent people from shooting themselves in the foot.  I can understand
> > why Phillip would like the package index to prevent people from choosing
> > problematic package names.
>
>That's not my understanding - the issue isn't with "problematic package
>names", but with conflicting package names. IOW, any single name is
>fine - it's a pair of names that would cause a problem (and only if
>you wanted to install both packages on the same system).

It's also a problem for locating the correct package in the first 
place...  which seems to fall under the jurisdiction of a "package index".  :)

This is just as important for direct human users of the Cheeseshop, 
as it is for the humans using software to access the Cheeseshop.


From jim at zope.com  Thu Jul 12 20:15:03 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 12 Jul 2007 14:15:03 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
	<469467AA.7070409@v.loewis.de>
	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
	<469522D6.1070706@v.loewis.de>
	<20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
Message-ID: <0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>


On Jul 12, 2007, at 2:07 PM, Phillip J. Eby wrote:

> At 08:35 PM 7/11/2007 +0200, Martin v. L?wis wrote:
>> > The questions for us is, how much effort we are willing to make to
>> > prevent people from shooting themselves in the foot.  I can  
>> understand
>> > why Phillip would like the package index to prevent people from  
>> choosing
>> > problematic package names.
>>
>> That's not my understanding - the issue isn't with "problematic  
>> package
>> names", but with conflicting package names. IOW, any single name is
>> fine - it's a pair of names that would cause a problem (and only if
>> you wanted to install both packages on the same system).
>
> It's also a problem for locating the correct package in the first  
> place...  which seems to fall under the jurisdiction of a "package  
> index".  :)
>
> This is just as important for direct human users of the Cheeseshop,  
> as it is for the humans using software to access the Cheeseshop.

I want to make sure I understand this.  I would hope that searching  
would be case insensitive and otherwise flexible wrt names.  Is there  
any reason we can't expect URLs and requirement specifications to be  
precisely spelled?  That is, if someone names their package "sPaM", I  
see no reason why PyPI needs to support anything other than http:// 
www.python.org/pypi/sPaM as the one URL of the package.  Someone  
should be able to use the search UI to search for "spam" and see a  
result that includes "sPaM".  From then on, they should be able to  
type the name "sPaM".  Or am I missing something?

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Thu Jul 12 20:43:11 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 12 Jul 2007 14:43:11 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -     apache config.
In-Reply-To: <0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
	<469467AA.7070409@v.loewis.de>
	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
	<469522D6.1070706@v.loewis.de>
	<20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
	<0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>
Message-ID: <20070712184056.F219A3A40B0@sparrow.telecommunity.com>

At 02:15 PM 7/12/2007 -0400, Jim Fulton wrote:
>I want to make sure I understand this.  I would hope that searching
>would be case insensitive and otherwise flexible wrt names.

PyPI's searching is indeed case insensitive, and is a 
substring/keyword search as well.


>   Is there
>any reason we can't expect URLs and requirement specifications to be
>precisely spelled?  That is, if someone names their package "sPaM", I
>see no reason why PyPI needs to support anything other than http:// 
>www.python.org/pypi/sPaM as the one URL of the package.  Someone
>should be able to use the search UI to search for "spam" and see a
>result that includes "sPaM".  From then on, they should be able to
>type the name "sPaM".  Or am I missing something?

You're missing that the subject is about similarity of names.  A typo 
of say, 'SPam' shouldn't return me some package *other* than the one 
I'm looking for.  It'd be nice if the resulting page said something 
besides "Not Found", too...  like "there's no SPam, but here are a 
bunch of packages whose name contains 'spam'".

If it did that, setuptools would be able to find the right page 
without hitting the main index, too.  But redirection, as proposed by 
Martin, also accomplishes the same thing.

And again, all this helps human direct users of the index, too.


From jim at zope.com  Thu Jul 12 21:02:10 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 12 Jul 2007 15:02:10 -0400
Subject: [Catalog-sig] Case sensitivity of package names
In-Reply-To: <20070712184056.F219A3A40B0@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
	<469467AA.7070409@v.loewis.de>
	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
	<469522D6.1070706@v.loewis.de>
	<20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
	<0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>
	<20070712184056.F219A3A40B0@sparrow.telecommunity.com>
Message-ID: <068C982C-701B-45D6-BC34-C48B217B80E8@zope.com>


On Jul 12, 2007, at 2:43 PM, Phillip J. Eby wrote:

> At 02:15 PM 7/12/2007 -0400, Jim Fulton wrote:
>> I want to make sure I understand this.  I would hope that searching
>> would be case insensitive and otherwise flexible wrt names.
>
> PyPI's searching is indeed case insensitive, and is a substring/ 
> keyword search as well.
>
>
>>   Is there
>> any reason we can't expect URLs and requirement specifications to be
>> precisely spelled?  That is, if someone names their package "sPaM", I
>> see no reason why PyPI needs to support anything other than  
>> http:// www.python.org/pypi/sPaM as the one URL of the package.   
>> Someone
>> should be able to use the search UI to search for "spam" and see a
>> result that includes "sPaM".  From then on, they should be able to
>> type the name "sPaM".  Or am I missing something?
>
> You're missing that the subject is about similarity of names.

>   A typo of say, 'SPam' shouldn't return me some package *other*  
> than the one I'm looking for.  I

No, I understand that part.    I understand the desire to avoid  
conflicts that cause problems down the road.  I would prefer to  
"disallow" this by rejecting new package names that are too similar  
to already-registered packages.

> t'd be nice if the resulting page said something besides "Not  
> Found", too...  like "there's no SPam, but here are a bunch of  
> packages whose name contains 'spam'".

I think this would be fine in a human interface.

> If it did that, setuptools would be able to find the right page  
> without hitting the main index, too.  But redirection, as proposed  
> by Martin, also accomplishes the same thing.

I really don't like this for setuptools.  My preference is that  
setuptools should be required to ask for a package with precise  
spelling.

> And again, all this helps human direct users of the index, too.

I think it encourages humans to do bad things.  Is someone misspells  
ZODB3 as zodb3 and is able to install it with easy_install, then  
they'll be tempted to use the name "zodb3" in their requirements  
specifications.  That is a bad thing IMO.  We're talking about  
technical users and I think it is reasonable to expect them to be  
precise in their specifications.

I could live with case-insensitive package names if we (for some  
definition of we, possibly being Guido) decided we want them, but I'd  
prefer they be case sensitive.  I'd still be in favor of avoiding  
confusing duplicates.  If we stick with case-sentitive package names,  
then I'd prefer that the interaction of setuptools with the index be  
case sensitive.

I wouldn't object to  setuptools giving people help. So, for example,  
if I type "zodb3", I wouldn't object to setuptools letting the user  
know that maybe they should use ZODB3.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Thu Jul 12 21:26:02 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 12 Jul 2007 15:26:02 -0400
Subject: [Catalog-sig] Case sensitivity of package names
In-Reply-To: <068C982C-701B-45D6-BC34-C48B217B80E8@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
	<469467AA.7070409@v.loewis.de>
	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
	<469522D6.1070706@v.loewis.de>
	<20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
	<0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>
	<20070712184056.F219A3A40B0@sparrow.telecommunity.com>
	<068C982C-701B-45D6-BC34-C48B217B80E8@zope.com>
Message-ID: <20070712192350.347B13A40B0@sparrow.telecommunity.com>

At 03:02 PM 7/12/2007 -0400, Jim Fulton wrote:
>We're talking about
>technical users and I think it is reasonable to expect them to be
>precise in their specifications.

IMO, "technical users" is a wider range of people than you seem to be 
thinking of.  In any case, this is a separate topic from disallowing 
too-similar names -- which you agree we should do.

Whether to then also introduce case-sensitivity into various parts of 
easy_install is another subject that doesn't really matter to the catalog-sig.

Please note, however, that it is not a minor change by any means -- 
case-insensitivity exists throughout pkg_resources and setuptools to 
handle operating system filename case-insensitivity, not just for 
index lookups.  In fact, I believe the index lookups *are* 
case-sensitive; IIRC it's only link parsing that is case-insensitive.


From jim at zope.com  Thu Jul 12 21:31:45 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 12 Jul 2007 15:31:45 -0400
Subject: [Catalog-sig] Case sensitivity of package names
In-Reply-To: <20070712192350.347B13A40B0@sparrow.telecommunity.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<468F3CD4.1070501@v.loewis.de>
	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
	<469467AA.7070409@v.loewis.de>
	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
	<469522D6.1070706@v.loewis.de>
	<20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
	<0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>
	<20070712184056.F219A3A40B0@sparrow.telecommunity.com>
	<068C982C-701B-45D6-BC34-C48B217B80E8@zope.com>
	<20070712192350.347B13A40B0@sparrow.telecommunity.com>
Message-ID: <4CD1A7D8-1911-45C9-AB08-C4DC3E1CDFA9@zope.com>


On Jul 12, 2007, at 3:26 PM, Phillip J. Eby wrote:
...
> Whether to then also introduce case-sensitivity into various parts  
> of easy_install is another subject that doesn't really matter to  
> the catalog-sig.

I'm not sure we agree on what matters to the catalog sig. :)
(I still need to respond to your note on that topic.)


> Please note, however, that it is not a minor change by any means --  
> case-insensitivity exists throughout pkg_resources and setuptools  
> to handle operating system filename case-insensitivity, not just  
> for index lookups.  In fact, I believe the index lookups *are* case- 
> sensitive; IIRC it's only link parsing that is case-insensitive.

I'm not suggesting that you shouldn't deal with file-system case  
insensitivity.  If I were to change setuptools to match my opinion, I  
would probably just change the code that tries to get a package  
listing to look for close matches to print a suggestion and stop  
rather than guessing a package name and continuing.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Thu Jul 12 23:09:32 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 12 Jul 2007 23:09:32 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<200707120809.48344.richardjones@optusnet.com.au>	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>	<4695B816.9020706@v.loewis.de>
	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
Message-ID: <4696988C.6050309@v.loewis.de>

> I wonder what times ab against  http://www.python.org/pypi/ZODB3 from  
> inside the python.org network would give.

I just measured it. 1000 requests take 17s using urllib, giving 60
request per second.

> I wonder if it would help much to make multiple HTTP requests in the  
> same connection.  This might be something to look at in setuptools  
> and/or xmlrpclib.

Only for remote connections, due to the round-trips required for
TCP handshake. Locally, Apache opens a new connection to the FCGI
servers per requests (using the farmer-worker pattern).

> 1. Since many people will be far away from PyPI, I think our long- 
> term plan should encompass geographic mirrors.  It's good that the  
> server is spending a small amount of time, but it still takes *me* a  
> long time to get data.

Ok. I am, in general, skeptical about mirroring. However, if it
makes people happy, feel free to implement it.

A number of issues should be considered, of course:
- there should be a way to get authoritative answers somehow, preferably
  from mirrors, but, if necessary, from the main site
- I really wish to collect download counters across mirrors. "Official"
  mirrors should be obliged to report download statistics once a day
  or so.

> 2. It's important to reduce the number of round trips.

A colleague today suggested that the best way to reduce round trips
is to give each machine a local copy of the index, the same way
Debian apt works: you do 'apt-get update', and then have a local
copy of the catalog that you can build against. No roundtrips
at all (except for the one to update the local catalog), for the
expense of being out of date if you don't manually update the
catalog.

Regards,
Martin


From martin at v.loewis.de  Thu Jul 12 23:12:55 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 12 Jul 2007 23:12:55 +0200
Subject: [Catalog-sig] www.python.org/pypi might redirect?
In-Reply-To: <20070712122030.GA5853@amk-desktop.matrixgroup.net>
References: <A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>	<4693FE94.6090107@v.loewis.de>
	<469446A2.9070500@pangalactic.us>	<46946A69.4000702@v.loewis.de>	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>	<46951612.9010009@v.loewis.de>	<bf7b44d50707111047u7e2ae963t9395e769ba827d@mail.gmail.com>	<46951B55.9050009@v.loewis.de>
	<20070712122030.GA5853@amk-desktop.matrixgroup.net>
Message-ID: <46969957.1020404@v.loewis.de>

> www.python.org/pypi does use mod_proxy to provide PyPI access from the
> old URL; it's possible these users were going through www.python.org.

I wonder why that is. Would there be anything wrong with making that
a (permanent) redirect instead?

Users of the old URL should see a speedup if they do many requests;
all relative URLs would directly go to cheeseshop, rather than having
to pass through www.python.org again.

Regards,
Martin

From martin at v.loewis.de  Thu Jul 12 23:25:58 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 12 Jul 2007 23:25:58 +0200
Subject: [Catalog-sig] start on static generation,
 and caching - apache config.
In-Reply-To: <9b06ffb10707120739s56ef8736mce1545071df3475b@mail.gmail.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<6DF003CA-0930-4255-A5CD-469689D9D2E2@zope.com>	<4693FE94.6090107@v.loewis.de>
	<469446A2.9070500@pangalactic.us>	<46946A69.4000702@v.loewis.de>	<bf7b44d50707110728t467356a6t7ee489c2cea1af7d@mail.gmail.com>	<9cee7ab80707111042w68b5c8e7sf220dc2cf4011bfd@mail.gmail.com>
	<9b06ffb10707120739s56ef8736mce1545071df3475b@mail.gmail.com>
Message-ID: <46969C66.2020806@v.loewis.de>

> As of 7:30am PST it's timing out on the website and via XML-RPC,
> testing from L.A. or Germany.

It seems the same crash of all FCGI servers (with a failure of mod_fcgi
to restart them) has happened again. I still have no clue what's causing
it, but I added a watchdog that should restart it within a minute the
next time.

Regards,
Martin

From martin at v.loewis.de  Thu Jul 12 23:38:50 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 12 Jul 2007 23:38:50 +0200
Subject: [Catalog-sig] Case sensitivity of package names
In-Reply-To: <068C982C-701B-45D6-BC34-C48B217B80E8@zope.com>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>	<469467AA.7070409@v.loewis.de>	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>	<469522D6.1070706@v.loewis.de>	<20070712180539.3BFB43A40D7@sparrow.telecommunity.com>	<0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>	<20070712184056.F219A3A40B0@sparrow.telecommunity.com>
	<068C982C-701B-45D6-BC34-C48B217B80E8@zope.com>
Message-ID: <46969F6A.8030904@v.loewis.de>

> I really don't like this for setuptools.  My preference is that  
> setuptools should be required to ask for a package with precise  
> spelling.

I think the way setuptools currently works is this:

Every name gets converted to its lower-case safe-name equivalent.
All dependencies, file names, resource identifications etc
are based on that version of the name, *not* the "true"
name of the package.

Then, when setuptools tries to find a package whose "true"
name is in mixed-case, it uses the lower-cased safe-named
version, and PyPI reports that the package does not exist.
Then, setuptools queries the entire package list, trying
to find out the original spelling of the package.

I'm sure Phillip will correct me if I'm wrong.

> I could live with case-insensitive package names if we (for some  
> definition of we, possibly being Guido) decided we want them, but I'd  
> prefer they be case sensitive.  I'd still be in favor of avoiding  
> confusing duplicates.  If we stick with case-sentitive package names,  
> then I'd prefer that the interaction of setuptools with the index be  
> case sensitive.

See above - I believe setuptools package names are case insensitive
today.

Regards,
Martin

From jim at zope.com  Fri Jul 13 01:14:33 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 12 Jul 2007 19:14:33 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <4696988C.6050309@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<200707120809.48344.richardjones@optusnet.com.au>	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>	<4695B816.9020706@v.loewis.de>
	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
	<4696988C.6050309@v.loewis.de>
Message-ID: <C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>


On Jul 12, 2007, at 5:09 PM, Martin v. L?wis wrote:
...
>> I wonder if it would help much to make multiple HTTP requests in the
>> same connection.  This might be something to look at in setuptools
>> and/or xmlrpclib.
>
> Only for remote connections, due to the round-trips required for
> TCP handshake. Locally, Apache opens a new connection to the FCGI
> servers per requests (using the farmer-worker pattern).

Right, but most connections will be remote, so this is a potential win.

>
>> 1. Since many people will be far away from PyPI, I think our long-
>> term plan should encompass geographic mirrors.  It's good that the
>> server is spending a small amount of time, but it still takes *me* a
>> long time to get data.
>
> Ok. I am, in general, skeptical about mirroring. However, if it
> makes people happy, feel free to implement it.

My goal is to have PyPI provide a simplified version of the data for  
use by setuptools that is easily mirrored using standard mirroring  
tools.  (I may actually prototype this with a kind of mirror.)

> A number of issues should be considered, of course:
> - there should be a way to get authoritative answers somehow,  
> preferably
>   from mirrors, but, if necessary, from the main site

I don't know what you mean.  I envision mirrors as being read-only  
and only used by setuptools. The main site would certainly be  
authoritative.

> - I really wish to collect download counters across mirrors.  
> "Official"
>   mirrors should be obliged to report download statistics once a day
>   or so.

OK.

>
>> 2. It's important to reduce the number of round trips.
>
> A colleague today suggested that the best way to reduce round trips
> is to give each machine a local copy of the index, the same way
> Debian apt works: you do 'apt-get update', and then have a local
> copy of the catalog that you can build against. No roundtrips
> at all (except for the one to update the local catalog), for the
> expense of being out of date if you don't manually update the
> catalog.

Yup. This might be a really nice way to go. It would be especially  
nice if a client could contact PyPI and ask for new data since a  
given time.  I imagine that this request could be as cheap as the  
requests we have now, unless a client was very out of date.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Fri Jul 13 01:35:05 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 12 Jul 2007 19:35:05 -0400
Subject: [Catalog-sig] Case sensitivity of package names
In-Reply-To: <46969F6A.8030904@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>
	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>
	<468FC2BB.7030607@v.loewis.de>
	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>
	<468FF69B.2090503@v.loewis.de>
	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>
	<46910BBF.3010308@v.loewis.de>
	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>
	<4692B3A3.5030209@v.loewis.de>
	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>
	<46931A3A.5000703@v.loewis.de>
	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>
	<4693FA2A.3020107@v.loewis.de>
	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>
	<469467AA.7070409@v.loewis.de>
	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
	<469522D6.1070706@v.loewis.de>
	<20070712180539.3BFB43A40D7@sparrow.telecommunity.com>
	<0F4183B4-D60B-4715-A75C-531332C0CE2B@zope.com>
	<20070712184056.F219A3A40B0@sparrow.telecommunity.com>
	<068C982C-701B-45D6-BC34-C48B217B80E8@zope.com>
	<46969F6A.8030904@v.loewis.de>
Message-ID: <20070712233252.3C2913A40A9@sparrow.telecommunity.com>

At 11:38 PM 7/12/2007 +0200, Martin v. L?wis wrote:
> > I really don't like this for setuptools.  My preference is that
> > setuptools should be required to ask for a package with precise
> > spelling.
>
>I think the way setuptools currently works is this:
>
>Every name gets converted to its lower-case safe-name equivalent.
>All dependencies, file names, resource identifications etc
>are based on that version of the name, *not* the "true"
>name of the package.

Object comparisons are done case-insensitively, but the objects 
themselves keep the case-insensitive forms ('key' attributes) 
separate from the originally-input names ('project_name' attributes).


>Then, when setuptools tries to find a package whose "true"
>name is in mixed-case, it uses the lower-cased safe-named
>version, and PyPI reports that the package does not exist.
>Then, setuptools queries the entire package list, trying
>to find out the original spelling of the package.

This is almost correct, except that it actually tries to lookup 
whatever the user actually input, then the safe_name() form of 
that.  For index lookups, it does not actually change the case of 
what was entered, so if the user enters something that exactly 
matches what's on PyPI, they'll have a better chance of getting 
everything in one request....  unless there are multiple versions 
listed, of course.


From pje at telecommunity.com  Fri Jul 13 01:43:04 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 12 Jul 2007 19:43:04 -0400
Subject: [Catalog-sig] start on static generation,
 and caching -     apache config.
In-Reply-To: <C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>
	<46953A70.6070600@v.loewis.de>
	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>
	<200707120809.48344.richardjones@optusnet.com.au>
	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>
	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>
	<4695B816.9020706@v.loewis.de>
	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
	<4696988C.6050309@v.loewis.de>
	<C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>
Message-ID: <20070712234049.97ED63A40A9@sparrow.telecommunity.com>

At 07:14 PM 7/12/2007 -0400, Jim Fulton wrote:

>On Jul 12, 2007, at 5:09 PM, Martin v. L?wis wrote:
> >> 2. It's important to reduce the number of round trips.
> >
> > A colleague today suggested that the best way to reduce round trips
> > is to give each machine a local copy of the index, the same way
> > Debian apt works: you do 'apt-get update', and then have a local
> > copy of the catalog that you can build against. No roundtrips
> > at all (except for the one to update the local catalog), for the
> > expense of being out of date if you don't manually update the
> > catalog.
>
>Yup. This might be a really nice way to go. It would be especially
>nice if a client could contact PyPI and ask for new data since a
>given time.  I imagine that this request could be as cheap as the
>requests we have now, unless a client was very out of date.

Such a query could simply consist of which packages had been updated, 
and the data could then be cleared from the local cache.

The downside to this approach is that it's not any faster for 
anything you've never downloaded before.

So, I'm not really sure how to create a quality user experience with 
edge caching alone.  It seems to me that geographically localized 
mirrors are needed to provide infrequent users and new users with 
good performance.  And presumably, the commercial users who are 
having issues now, want their users as well as their developers to 
have good performance.

(Personally, I find it extremely irritating every time the "yum" 
package manager makes me wait for it to download a bunch of 
repository data that isn't necessarily even related to what I just 
asked it to do.)


From doug at hellfly.net  Fri Jul 13 03:26:12 2007
From: doug at hellfly.net (Doug Hellmann)
Date: Thu, 12 Jul 2007 21:26:12 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<200707120809.48344.richardjones@optusnet.com.au>	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>	<4695B816.9020706@v.loewis.de>
	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
	<4696988C.6050309@v.loewis.de>
	<C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>
Message-ID: <786451BD-A013-48C1-87B9-884F46151B81@hellfly.net>


On Jul 12, 2007, at 7:14 PM, Jim Fulton wrote:

>>> 2. It's important to reduce the number of round trips.
>>
>> A colleague today suggested that the best way to reduce round trips
>> is to give each machine a local copy of the index, the same way
>> Debian apt works: you do 'apt-get update', and then have a local
>> copy of the catalog that you can build against. No roundtrips
>> at all (except for the one to update the local catalog), for the
>> expense of being out of date if you don't manually update the
>> catalog.
>
> Yup. This might be a really nice way to go. It would be especially
> nice if a client could contact PyPI and ask for new data since a
> given time.  I imagine that this request could be as cheap as the
> requests we have now, unless a client was very out of date.

That sounds like RSS.

Doug


From martin at v.loewis.de  Fri Jul 13 10:04:33 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 13 Jul 2007 10:04:33 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<200707120809.48344.richardjones@optusnet.com.au>	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>	<4695B816.9020706@v.loewis.de>
	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
	<4696988C.6050309@v.loewis.de>
	<C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>
Message-ID: <46973211.1060801@v.loewis.de>

>> A number of issues should be considered, of course:
>> - there should be a way to get authoritative answers somehow, preferably
>>   from mirrors, but, if necessary, from the main site
> 
> I don't know what you mean.  I envision mirrors as being read-only and
> only used by setuptools. The main site would certainly be authoritative.

The problem is with outdated information. With a mirror, the question
is always "is my information current". Perhaps it's ok for users of
a mirror to use outdated information. However, when people register
a package, then use setuptools to install it, they might be puzzled
that it won't find the package just because it was using an outdated
mirror.

In many cases, it's fine to use outdated information, of course, e.g.
if you know that the package hasn't been released for many weeks now,
or in case you will update the next day again, and then fetch the
newer release.

> Yup. This might be a really nice way to go. It would be especially nice
> if a client could contact PyPI and ask for new data since a given time. 
> I imagine that this request could be as cheap as the requests we have
> now, unless a client was very out of date.

PyPI already supports that: the updated_releases RPC call will return
all packages that have changed since a given date.

Regards,
Martin

From jim at zope.com  Fri Jul 13 16:59:01 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 10:59:01 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <46973211.1060801@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<200707120809.48344.richardjones@optusnet.com.au>	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>	<4695B816.9020706@v.loewis.de>
	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
	<4696988C.6050309@v.loewis.de>
	<C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>
	<46973211.1060801@v.loewis.de>
Message-ID: <D878FF10-D490-48B8-AE2D-EF4B242823A6@zope.com>


On Jul 13, 2007, at 4:04 AM, Martin v. L?wis wrote:

>>> A number of issues should be considered, of course:
>>> - there should be a way to get authoritative answers somehow,  
>>> preferably
>>>   from mirrors, but, if necessary, from the main site
>>
>> I don't know what you mean.  I envision mirrors as being read-only  
>> and
>> only used by setuptools. The main site would certainly be  
>> authoritative.
>
> The problem is with outdated information. With a mirror, the question
> is always "is my information current". Perhaps it's ok for users of
> a mirror to use outdated information. However, when people register
> a package, then use setuptools to install it, they might be puzzled
> that it won't find the package just because it was using an outdated
> mirror.

I agree 100% with this concern, which is why I was skeptical of  
caching in the classical form.

Right. So the question is, how can we keep the mirror up to date? :)


>> Yup. This might be a really nice way to go. It would be especially  
>> nice
>> if a client could contact PyPI and ask for new data since a given  
>> time.
>> I imagine that this request could be as cheap as the requests we have
>> now, unless a client was very out of date.
>
> PyPI already supports that: the updated_releases RPC call will return
> all packages that have changed since a given date.

Awesome!  Too bad it wasn't shown in:
http://wiki.python.org/moin/CheeseShopXmlRpc

I'll look at the source (location hints welcome) and update that page.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Fri Jul 13 17:14:38 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 13 Jul 2007 17:14:38 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <D878FF10-D490-48B8-AE2D-EF4B242823A6@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<200707120809.48344.richardjones@optusnet.com.au>	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>	<4695B816.9020706@v.loewis.de>
	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
	<4696988C.6050309@v.loewis.de>
	<C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>
	<46973211.1060801@v.loewis.de>
	<D878FF10-D490-48B8-AE2D-EF4B242823A6@zope.com>
Message-ID: <469796DE.805@v.loewis.de>

> Right. So the question is, how can we keep the mirror up to date? :)

I think there is no efficient way to provide perfect synchronization
(not without putting too much load on the central server again).

If slight propagationdelays are acceptable, it would be possible that
the central server publishes sequence numbers of each update performed,
and mirrors could check with a single roundtrip what the most current
sequence number is.

Then it is the mirror's choice how much it can age; checking every
minute would be reasonable IMO for most purposes; users that want
to see their just-uploaded stuff then would either need to wait
that minute, or go to the master site, or fetch the sequence
number of the master site and compare it with the one of the mirror
they use.

> I'll look at the source (location hints welcome) and update that page.

See

http://svn.python.org/view/trunk/pypi/rpc.py?rev=433&root=packages&view=markup

Regards,
Martin

From jim at zope.com  Fri Jul 13 18:01:18 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 12:01:18 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <46973211.1060801@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<200707120809.48344.richardjones@optusnet.com.au>	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>	<4695B816.9020706@v.loewis.de>
	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
	<4696988C.6050309@v.loewis.de>
	<C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>
	<46973211.1060801@v.loewis.de>
Message-ID: <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>


On Jul 13, 2007, at 4:04 AM, Martin v. L?wis wrote:
> PyPI already supports that: the updated_releases RPC call will return
> all packages that have changed since a given date.

It appears that this only shows new releases.  If I update a new  
distribution to a release, it doesn't cause the release to appear as  
updated. A common scenario for me is that I'll create a release,  
update a source release, and then, some time later, when someone bugs  
me, I'll upload a windows egg.  The way things are now, the later  
upload won't be noticed. Of course, the initial upload won't  be  
noticed if someone happens to poll between release creation and the  
first upload.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Fri Jul 13 18:02:33 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 12:02:33 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <469796DE.805@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<200707120809.48344.richardjones@optusnet.com.au>	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>	<4695B816.9020706@v.loewis.de>
	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
	<4696988C.6050309@v.loewis.de>
	<C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>
	<46973211.1060801@v.loewis.de>
	<D878FF10-D490-48B8-AE2D-EF4B242823A6@zope.com>
	<469796DE.805@v.loewis.de>
Message-ID: <E8DF0B80-4FEA-4AA4-87BF-F2BEFABAF004@zope.com>


On Jul 13, 2007, at 11:14 AM, Martin v. L?wis wrote:

>> Right. So the question is, how can we keep the mirror up to date? :)
>
> I think there is no efficient way to provide perfect synchronization
> (not without putting too much load on the central server again).

Well, if there mirrors were known, then the primary could notify  
them.  Of course, that would make them more complex.  Of course  
polling has its complexities too.


> If slight propagationdelays are acceptable, it would be possible that
> the central server publishes sequence numbers of each update  
> performed,
> and mirrors could check with a single roundtrip what the most current
> sequence number is.

If the updated_releases actually reflected updates, then I think that  
would be good enough. Then we could use the UTC second as  the  
sequence number. :)

>
> Then it is the mirror's choice how much it can age; checking every
> minute would be reasonable IMO for most purposes;

Yup

> users that want
> to see their just-uploaded stuff then would either need to wait
> that minute, or go to the master site, or fetch the sequence
> number of the master site and compare it with the one of the mirror
> they use.

Yup

>
>> I'll look at the source (location hints welcome) and update that  
>> page.
>
> See
>
> http://svn.python.org/view/trunk/pypi/rpc.py? 
> rev=433&root=packages&view=markup

Thanks.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Fri Jul 13 18:50:50 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 13 Jul 2007 18:50:50 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<200707120809.48344.richardjones@optusnet.com.au>	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>	<4695B816.9020706@v.loewis.de>
	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
	<4696988C.6050309@v.loewis.de>
	<C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>
	<46973211.1060801@v.loewis.de>
	<2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
Message-ID: <4697AD6A.1030602@v.loewis.de>

> It appears that this only shows new releases.

That's true. I don't know why it does that; it may be that this
interface predates file uploading.

> If I update a new distribution to a release

With "distribution", you always mean "file", right?

Regards,
Martin

From martin at v.loewis.de  Fri Jul 13 19:07:30 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 13 Jul 2007 19:07:30 +0200
Subject: [Catalog-sig] Effect of HTTP 1.1
Message-ID: <4697B152.7030304@v.loewis.de>

I did some measurements, with the script below.
For 30 requests, a single HTTP 1.1 connection
needs 5.4s over my DSL connection; 30 individual
connections need 11.7s. So if setuptools expects
to request multiple pages from the index, it would
definitely be useful to keep the connection
(I don't know at all whether it currently does so
already).

Regards,
Martin

import httplib, time

t=time.time()
h = httplib.HTTPConnection("cheeseshop.python.org")
for i in range(30):
    h.putrequest("GET", "/pypi/Lamina/")
    h.endheaders()
    r = h.getresponse()
    r.begin()
    r.read()
h.close()
print time.time()-t

t=time.time()
for i in range(30):
    h = httplib.HTTPConnection("cheeseshop.python.org")
    h.putrequest("GET", "/pypi/Lamina/")
    h.endheaders()
    r = h.getresponse()
    r.begin()
    r.read()
    h.close()
print time.time()-t

From jim at zope.com  Fri Jul 13 19:45:10 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 13:45:10 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <4697AD6A.1030602@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<200707120809.48344.richardjones@optusnet.com.au>	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>	<4695B816.9020706@v.loewis.de>
	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>
	<4696988C.6050309@v.loewis.de>
	<C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>
	<46973211.1060801@v.loewis.de>
	<2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
	<4697AD6A.1030602@v.loewis.de>
Message-ID: <C375E804-A4DE-49C1-A5DE-B432130EC94D@zope.com>


On Jul 13, 2007, at 12:50 PM, Martin v. L?wis wrote:

>> It appears that this only shows new releases.
>
> That's true. I don't know why it does that; it may be that this
> interface predates file uploading.
>
>> If I update a new distribution to a release
>
> With "distribution", you always mean "file", right?

Yup.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Fri Jul 13 19:54:57 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 13:54:57 -0400
Subject: [Catalog-sig] Effect of HTTP 1.1
In-Reply-To: <4697B152.7030304@v.loewis.de>
References: <4697B152.7030304@v.loewis.de>
Message-ID: <657E3A38-2871-4B4F-9CBE-B5A777CFB9F5@zope.com>


On Jul 13, 2007, at 1:07 PM, Martin v. L?wis wrote:

> I did some measurements, with the script below.
> For 30 requests, a single HTTP 1.1 connection
> needs 5.4s over my DSL connection; 30 individual
> connections need 11.7s.

Interesting.  Your DSL times for connection/request are actually  
longer than what I'm seeing. Maybe geography isn't so important.  
Measurements are good.  It's going to be interesting to see how this  
all pans out.   It''s definitely interesting that you doubled the  
throughput using a single connection.

> So if setuptools expects
> to request multiple pages from the index, it would
> definitely be useful to keep the connection
> (I don't know at all whether it currently does so
> already).

I don't think so.

This also looks like a good optimization for xmlrpclib.

Thanks for trying this.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Fri Jul 13 20:34:45 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 13 Jul 2007 14:34:45 -0400
Subject: [Catalog-sig] Effect of HTTP 1.1
In-Reply-To: <4697B152.7030304@v.loewis.de>
References: <4697B152.7030304@v.loewis.de>
Message-ID: <20070713183235.817C13A40A8@sparrow.telecommunity.com>

At 07:07 PM 7/13/2007 +0200, Martin v. L?wis wrote:
>I did some measurements, with the script below.
>For 30 requests, a single HTTP 1.1 connection
>needs 5.4s over my DSL connection; 30 individual
>connections need 11.7s. So if setuptools expects
>to request multiple pages from the index, it would
>definitely be useful to keep the connection
>(I don't know at all whether it currently does so
>already).

It doesn't.  I looked just now and found this, that looks like it 
might produce the desired effect for easy_install:

http://linux.duke.edu/projects/urlgrabber/contents/urlgrabber/keepalive.py

Perhaps someone (Jim?) would like to try activating it in a process 
using easy_install (i.e. doing the urllib2.install_opener dance), and 
see if it gives a performance boost.  If it works well, then perhaps 
a patch for setuptools.package_index to use a custom opener is in order.


From jim at zope.com  Fri Jul 13 20:51:54 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 14:51:54 -0400
Subject: [Catalog-sig] Effect of HTTP 1.1
In-Reply-To: <20070713183235.817C13A40A8@sparrow.telecommunity.com>
References: <4697B152.7030304@v.loewis.de>
	<20070713183235.817C13A40A8@sparrow.telecommunity.com>
Message-ID: <5F747173-F02A-42A5-8767-ACDA61CD0C5C@zope.com>


On Jul 13, 2007, at 2:34 PM, Phillip J. Eby wrote:

> At 07:07 PM 7/13/2007 +0200, Martin v. L?wis wrote:
>> I did some measurements, with the script below.
>> For 30 requests, a single HTTP 1.1 connection
>> needs 5.4s over my DSL connection; 30 individual
>> connections need 11.7s. So if setuptools expects
>> to request multiple pages from the index, it would
>> definitely be useful to keep the connection
>> (I don't know at all whether it currently does so
>> already).
>
> It doesn't.  I looked just now and found this, that looks like it  
> might produce the desired effect for easy_install:
>
> http://linux.duke.edu/projects/urlgrabber/contents/urlgrabber/ 
> keepalive.py
>
> Perhaps someone (Jim?) would like to try activating it in a process  
> using easy_install (i.e. doing the urllib2.install_opener dance),  
> and see if it gives a performance boost.  If it works well, then  
> perhaps a patch for setuptools.package_index to use a custom opener  
> is in order.

I'd be happy to do this sometime in the next few weeks.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Fri Jul 13 22:17:20 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 16:17:20 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <4697D796.5080803@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<200707120809.48344.richardjones@optusnet.com.au>	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>	<4695B816.9020706@v.loewis.de>	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>	<4696988C.6050309@v.loewis.de>	<C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>	<46973211.1060801@v.loewis.de>
	<2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
	<4697D796.5080803@v.loewis.de>
Message-ID: <FCA5A24F-7240-44DB-8BD3-2EB5CD779441@zope.com>


On Jul 13, 2007, at 3:50 PM, Martin v. L?wis wrote:

>> It appears that this only shows new releases.  If I update a new
>> distribution to a release, it doesn't cause the release to appear as
>> updated. A common scenario for me is that I'll create a release,
>> update a source release, and then, some time later, when someone bugs
>> me, I'll upload a windows egg.  The way things are now, the later
>> upload won't be noticed. Of course, the initial upload won't  be
>> noticed if someone happens to poll between release creation and the
>> first upload.
>
> Ok, I added another operation "changelog", that gives you four-tuples
> name, version, timestamp, action. It's the complete journal, except
> that privacy fields (author and IP) are not returned, and except
> changes to the package (rather than a specific release) are not
> returned.

Very cool.  Thanks!  It doesn't seem to catch file-uploads, either  
through distutils or through the web. I uploaded a windows release  
for zope.proxy this morning and I just (withing the last half hour)  
uploaded some eggs for http://cheeseshop.python.org/pypi/ 
zc.zodbrecipes/0.2.1 and am not seeing anything in the transcript.

> The possible values for "action" remain undocumented. If there is
> interested, people can propose a specification that PyPI should
> try to stick to; this specification should allow for
> still-undocumented action values (to allow addition of more actions).

I have no immediate use for action at this time other than as  
documentation when interpreting the output.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Fri Jul 13 22:43:07 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 13 Jul 2007 22:43:07 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <FCA5A24F-7240-44DB-8BD3-2EB5CD779441@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<200707120809.48344.richardjones@optusnet.com.au>	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>	<4695B816.9020706@v.loewis.de>	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>	<4696988C.6050309@v.loewis.de>	<C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>	<46973211.1060801@v.loewis.de>
	<2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
	<4697D796.5080803@v.loewis.de>
	<FCA5A24F-7240-44DB-8BD3-2EB5CD779441@zope.com>
Message-ID: <4697E3DB.8070801@v.loewis.de>

> Very cool.  Thanks!  It doesn't seem to catch file-uploads, either
> through distutils or through the web. I uploaded a windows release for
> zope.proxy this morning and I just (withing the last half hour) uploaded
> some eggs for http://cheeseshop.python.org/pypi/zc.zodbrecipes/0.2.1 and
> am not seeing anything in the transcript.

It appears that file additions were logged without a package version
(just package name). I don't know why this is, but I changed changelog
to return all entries (so version may be None, using the XML-RPC nil
extension). I also started logging the version for the file.

So please try again.

Regards,
Martin

From jim at zope.com  Fri Jul 13 23:15:26 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 13 Jul 2007 17:15:26 -0400
Subject: [Catalog-sig] start on static generation,
	and caching -    apache config.
In-Reply-To: <4697E3DB.8070801@v.loewis.de>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<200707120809.48344.richardjones@optusnet.com.au>	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>	<4695B816.9020706@v.loewis.de>	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>	<4696988C.6050309@v.loewis.de>	<C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>	<46973211.1060801@v.loewis.de>
	<2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
	<4697D796.5080803@v.loewis.de>
	<FCA5A24F-7240-44DB-8BD3-2EB5CD779441@zope.com>
	<4697E3DB.8070801@v.loewis.de>
Message-ID: <8175AD9F-7D42-4C8C-8F97-2CAAA876F7D9@zope.com>


On Jul 13, 2007, at 4:43 PM, Martin v. L?wis wrote:

>> Very cool.  Thanks!  It doesn't seem to catch file-uploads, either
>> through distutils or through the web. I uploaded a windows release  
>> for
>> zope.proxy this morning and I just (withing the last half hour)  
>> uploaded
>> some eggs for http://cheeseshop.python.org/pypi/zc.zodbrecipes/ 
>> 0.2.1 and
>> am not seeing anything in the transcript.
>
> It appears that file additions were logged without a package version
> (just package name). I don't know why this is, but I changed changelog
> to return all entries (so version may be None, using the XML-RPC nil
> extension). I also started logging the version for the file.
>
> So please try again.

Works great! Thanks!

(Now I just wish I wasn't going to be offline all weekend.)

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Fri Jul 13 21:50:46 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 13 Jul 2007 21:50:46 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
References: <200707110404.l6B44OXk032154@mail16.syd.optusnet.com.au>	<46953A70.6070600@v.loewis.de>	<ECFE2268-F2FA-44D4-95D4-23B1F147A2DC@zope.com>	<200707120809.48344.richardjones@optusnet.com.au>	<297846B8-94DC-4770-9476-711796E82FEC@zope.com>	<ABC0379F-7153-465E-88F3-6ECC919D6D99@zope.com>	<4695B816.9020706@v.loewis.de>	<21756CBF-41A7-4906-AE5D-6F45E879BFEC@zope.com>	<4696988C.6050309@v.loewis.de>	<C4474D36-81FB-4B15-98B2-1C68AD495462@zope.com>	<46973211.1060801@v.loewis.de>
	<2C7890C2-A76C-4F33-AE22-97257A74E3DF@zope.com>
Message-ID: <4697D796.5080803@v.loewis.de>

> It appears that this only shows new releases.  If I update a new  
> distribution to a release, it doesn't cause the release to appear as  
> updated. A common scenario for me is that I'll create a release,  
> update a source release, and then, some time later, when someone bugs  
> me, I'll upload a windows egg.  The way things are now, the later  
> upload won't be noticed. Of course, the initial upload won't  be  
> noticed if someone happens to poll between release creation and the  
> first upload.

Ok, I added another operation "changelog", that gives you four-tuples
name, version, timestamp, action. It's the complete journal, except
that privacy fields (author and IP) are not returned, and except
changes to the package (rather than a specific release) are not
returned.

The possible values for "action" remain undocumented. If there is
interested, people can propose a specification that PyPI should
try to stick to; this specification should allow for
still-undocumented action values (to allow addition of more actions).

Regards,
Martin

From gentoodev at gmail.com  Tue Jul 17 08:45:14 2007
From: gentoodev at gmail.com (Rob Cakebread)
Date: Mon, 16 Jul 2007 23:45:14 -0700
Subject: [Catalog-sig] PyPI command-line tool: yolk
Message-ID: <9b06ffb10707162345s7813d59dpc61b758b50d3df66@mail.gmail.com>

yolk 0.3.0 has been released and lets you use the new PyPI XML-RPC
methods 'changelog' and 'updated_releases'.

You can see the latest releases for the last <hours>:

yolk -L 24

You can see a detailed ChangeLog of The Cheese Shop by the last <hours>:

yolk -C 6


http://tools.assembla.com/yolk

From stuart at stuartbishop.net  Wed Jul 18 11:58:11 2007
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Wed, 18 Jul 2007 16:58:11 +0700
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <469522D6.1070706@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>	<469467AA.7070409@v.loewis.de>	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
	<469522D6.1070706@v.loewis.de>
Message-ID: <469DE433.4040405@stuartbishop.net>

Martin v. L?wis wrote:
>> The questions for us is, how much effort we are willing to make to
>> prevent people from shooting themselves in the foot.  I can understand
>> why Phillip would like the package index to prevent people from choosing
>> problematic package names.
> 
> That's not my understanding - the issue isn't with "problematic package
> names", but with conflicting package names. IOW, any single name is
> fine - it's a pair of names that would cause a problem (and only if
> you wanted to install both packages on the same system).

By not blocking registration of packages with similar names, we are creating
a security problem. If there is a popular package 'CoolStuff', I just have
to upload a trojan 'coolstuff' and suddenly people will end up using my
trojan which they thought was coming from a trusted source. I think this
attack vector is possible right now and only a BUGTRAQ post away from being
common knowledge.

I think blocking this is the responsibility of the package index, as it is
the first point that it is possible to do so.

I think a reasonable restriction would be printable ASCII only names and not
allowing registration of a package with a name differing only in case,
whitespace or punctuation.

There are additional side benefits that fall out of this (being able
optimize searches by doing exact matches rather than fuzzy, or avoiding
whole classes of case-sensitivity or Unicode bugs in other applications
integrating with the registry, or reducing confusion to end users, or
reducing the likely hood of less user-hostile systems being developed and
making the official registry irrelevant - heck, I work on a closed source
system that would happily take the business).

-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: OpenPGP digital signature
Url : http://mail.python.org/pipermail/catalog-sig/attachments/20070718/e971e102/attachment.pgp 

From martin at v.loewis.de  Thu Jul 19 00:07:30 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 19 Jul 2007 00:07:30 +0200
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <469DE433.4040405@stuartbishop.net>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>	<469467AA.7070409@v.loewis.de>	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
	<469522D6.1070706@v.loewis.de> <469DE433.4040405@stuartbishop.net>
Message-ID: <469E8F22.7080204@v.loewis.de>

> I think blocking this is the responsibility of the package index, as it is
> the first point that it is possible to do so.

Would you like to contribute a patch?

Regards,
Martin

From stuart at stuartbishop.net  Thu Jul 19 06:00:53 2007
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Thu, 19 Jul 2007 11:00:53 +0700
Subject: [Catalog-sig] start on static generation,
 and caching -    apache config.
In-Reply-To: <469E8F22.7080204@v.loewis.de>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>	<469467AA.7070409@v.loewis.de>	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
	<469522D6.1070706@v.loewis.de> <469DE433.4040405@stuartbishop.net>
	<469E8F22.7080204@v.loewis.de>
Message-ID: <469EE1F5.7000802@stuartbishop.net>

Martin v. L?wis wrote:
>> I think blocking this is the responsibility of the package index, as it is
>> the first point that it is possible to do so.
> 
> Would you like to contribute a patch?

Yes, but it would be rather pointless to make one if my analysis is
incorrect or it would be bounced for some non-technical reason so I emailed
it for discussion. I'm also unsure if switching to exact matching on a
normalized string instead of substring matching is good (well... it is good
for performance, but might not be good for UI).

I haven't looked at the source code to see how much work is involved yet -
if I find the Python code incomprehensible I should at least be able to do
the PostgreSQL side of things.

-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: OpenPGP digital signature
Url : http://mail.python.org/pipermail/catalog-sig/attachments/20070719/5e71e1d1/attachment.pgp 

From martin at v.loewis.de  Thu Jul 19 09:17:15 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 19 Jul 2007 09:17:15 +0200
Subject: [Catalog-sig] Package naming (Was: start on static generation,
 and caching -    apache config.)
In-Reply-To: <469EE1F5.7000802@stuartbishop.net>
References: <64ddb72c0707062124r30aa1b06k88838e06d73c45bd@mail.gmail.com>	<468F3CD4.1070501@v.loewis.de>	<64ddb72c0707070038j3970e379gabae0060f48036ea@mail.gmail.com>	<64ddb72c0707070203q39247d4axf183292e0b00225@mail.gmail.com>	<468FC2BB.7030607@v.loewis.de>	<6B8CDD2C-4F02-4836-89FA-0D00EAFE0F74@zope.com>	<468FF69B.2090503@v.loewis.de>	<057B56A0-CA3D-4EF3-B34D-A3174FE3B72C@zope.com>	<46910BBF.3010308@v.loewis.de>	<A8FA6ECA-F5C8-4668-BD54-968017E44782@zope.com>	<4692B3A3.5030209@v.loewis.de>	<20070710003214.A2EA83A404D@sparrow.telecommunity.com>	<46931A3A.5000703@v.loewis.de>	<20070710141304.BC6903A40A4@sparrow.telecommunity.com>	<4693FA2A.3020107@v.loewis.de>	<20070710221547.4A3043A40A4@sparrow.telecommunity.com>	<469467AA.7070409@v.loewis.de>	<7605F808-8C05-4735-A8E9-F2663083F4F5@zope.com>
	<469522D6.1070706@v.loewis.de> <469DE433.4040405@stuartbishop.net>
	<469E8F22.7080204@v.loewis.de> <469EE1F5.7000802@stuartbishop.net>
Message-ID: <469F0FFB.6010904@v.loewis.de>

> Yes, but it would be rather pointless to make one if my analysis is
> incorrect or it would be bounced for some non-technical reason so I emailed
> it for discussion. I'm also unsure if switching to exact matching on a
> normalized string instead of substring matching is good (well... it is good
> for performance, but might not be good for UI).

That's something completely different. I thought you were saying that
the Cheeseshop should block conflicting registrations. To implement
that, you only have to perform any normalization when a new project
is registered. There are roughly three new registrations per day, so
performance is irrelevant here.

Matching on lookup is rather a convenience to users; they can put
in a misspelled string and still find the package. OTOH, the search
interface already does case-insensitive matching; I doubt that doing
it in the URL adds much convenience. OTOH, it does add performance
(not convenience) to setuptools users, as setuptools could stop
downloading the complete package list to find the match.

But these are unrelated; if you want to contribute, it might be
best to just focus on the part that really worries you (namely
the security risk of conflicting registrations).

Regards,
Martin


From jim at zope.com  Thu Jul 19 13:06:34 2007
From: jim at zope.com (Jim Fulton)
Date: Thu, 19 Jul 2007 07:06:34 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
Message-ID: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>

Over the past few months, we've struggled quite a bit with Python  
Package Index (PyPI) performance and stability.  Thanks to the heroic  
efforts of Martin v. L?wis and others, performance and especially  
stability have improved quite a bit. Martin has demonstrated that, at  
least when running well, PyPI seems to answer most requests on the  
order of 7 miliseconds (around 150 requests per second) internally.   
That's not bad.  Unfortunately for users, actual times can be quite a  
bit longer.  For me at work, request take around 300 milliseconds.   
For Martin, they seem to take somewhat longer.  300 milliseconds  
isn't so bad for a request or two, however, easy install can easily  
make 10s or even hundreds of requests to satisfy a user request for a  
package.  zc.buildout, when verifying that a large system with many  
tens of packages has the most up to date versions of each package can  
easily make thousands of requests.

Why do setuptools and buildout make so many requests?  If a package  
exposes more than one release, then setuptools checks the package's  
main PyPI page and the pages for each release.  We need to be able to  
easily use older releases, so we can't hide old releases.  Typical  
projects of ours have many old releases exposed.  If setuptools was  
more clever in the way it searched PyPI, but it would still have to  
make a minimum of 2 requests per package for packages with multiple  
versions exposed.

Another potential issue is that PyPI pages can be large.  I've found  
it convenient to use PyPI package pages as the home page for many of  
my projects.  I like to include package documentation in my project  
pages.  Perhaps this is an abuse of PyPI, but it is very convenient  
for me and no one has complained. :)  The zc.buildout pages are  
around 200K.  That's a fair bit of data for setuptools to download  
and scan for download URLs.

In the course of this discussion, I've realized that it doesn't make  
sense for setuptools to use the same interface that humans use.   
setuptools doesn't need to see all of the data that is useful to  
humans. Similarly, humans generally don't need to see all of the  
historical releases for a project.  I suggested a simple page format  
designed just for setuptools.  An alternative would be an xmlrpc  
API.  I prefer pages because I think that, over time, the amount of  
requests from automated tools like easy_install and zc.buildout will  
increase substantially and ultimately, will overwhelm dynamic  
servers, even ones like PyPI that are reasonably fast.  I also think  
that a simple static collection of pages will be easier to mirror and  
I think some number of geographic mirrors is likely to help some  
people.  I promised to prototype the format I suggested.

I've created and experimental prototype setuptools-specific package  
index at

   http://download.zope.org/ppix

Going to that page gives brief instructions for using it with  
easy_install and zc.buildout.  To see an individual package page, add  
the package name to the URL, as in:

   http://download.zope.org/ppix/setuptools/

A few things to note about this:

- I don't expose a long package list at http://download.zope.org/ 
ppix/.  The long package list would be expensive to download and  
supports a use case that I consider to be of negative value, which is  
installing packages with case-insensitive package names,  I think it  
is important for humans to be able to search for packages using case- 
insensitive search terms, but I think that, after identifying a  
package, precise package names should be used.  I think it is  
especially important that precise package names be used in package  
requirements.

- There is a single page per package.  This can greatly reduce the  
number of requests.  Packages that store all of their distributions  
in PyPI and that don't have off-site home pages or download URLs can  
be scanned with a single request.  Note that I excluded home page and  
download URLs that pointed back to the packages PyPI page, as that  
wouldn't provide any new information to setuptools.

- Download URLs for *hidden* packages are included.  Humans don't  
need to see old revisions, but setuptools-based tools do.  If we used  
an index like this for setuptools, we could stop unhiding old  
releases when we created new releases in PyPI.  This would make PyPI  
more useful to humans and less of a pain for developers.

- Download URLs are the same as they are in PyPI.  Using this new  
index, distributions are still downloaded from PyPI, so the index  
doesn't affect PyPI download statistics.

To see the impact of this, it's interesting to look at installing  
zc.buildout using easy_install from PyPI and from the experimental  
index:
Installing using PyPI looks like this:

   (env)jim at ds9:~/tmp$ time easy_install zc.buildout
   Searching for zc.buildout
   Reading http://cheeseshop.python.org/pypi/zc.buildout/
   Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b19
   Reading http://svn.zope.org/zc.buildout
   Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b22
   Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b23
   Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b20
   Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b21
   Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b26
   Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b27
   Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b24
   Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b25
   Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b28
   Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b17
   Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b16
   Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b18
   Best match: zc.buildout 1.0.0b28
   Downloading http://cheeseshop.python.org/packages/2.5/z/ 
zc.buildout/zc.buildout-1.0.0b28- 
py2.5.egg#md5=4e37e53f010ed7984555a029732f479d
   Processing zc.buildout-1.0.0b28-py2.5.egg
   creating /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28- 
py2.5.egg
   Extracting zc.buildout-1.0.0b28-py2.5.egg to /home/jim/tmp/env/lib/ 
python2.5
   Adding zc.buildout 1.0.0b28 to easy-install.pth file
   Installing buildout script to /home/jim/tmp/env/bin/

   Installed /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28- 
py2.5.egg
   Processing dependencies for zc.buildout
   Searching for setuptools==0.6c6
   Best match: setuptools 0.6c6
   Processing setuptools-0.6c6-py2.5.egg
   Adding setuptools 0.6c6 to easy-install.pth file
   Installing easy_install script to /home/jim/tmp/env/bin/
   Installing easy_install-2.5 script to /home/jim/tmp/env/bin/

   Installed /home/jim/tmp/env/lib/python2.5/setuptools-0.6c6-py2.5.egg
   Processing dependencies for setuptools==0.6c6
   Finished processing dependencies for setuptools==0.6c6
   Finished installing setuptools==0.6c6
   Finished processing dependencies for zc.buildout
   Finished installing zc.buildout

   real	0m31.360s
   user	0m1.136s
   sys	0m0.060s

Note the large number of pages read.  Here I was installing a single  
package with one dependency, setuptools, that was already installed.  
Let's look at this again using the experimental index:

   (env)jim at ds9:~/tmp$ time easy_install -i http://download.zope.org/ 
ppix zc.buildout
   Searching for zc.buildout
   Reading http://download.zope.org/ppix/zc.buildout/
   Best match: zc.buildout 1.0.0b28
   Downloading http://cheeseshop.python.org/packages/2.5/z/ 
zc.buildout/zc.buildout-1.0.0b28- 
py2.5.egg#md5=4e37e53f010ed7984555a029732f479d
   Processing zc.buildout-1.0.0b28-py2.5.egg
   creating /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28- 
py2.5.egg
   Extracting zc.buildout-1.0.0b28-py2.5.egg to /home/jim/tmp/env/lib/ 
python2.5
   Adding zc.buildout 1.0.0b28 to easy-install.pth file
   Installing buildout script to /home/jim/tmp/env/bin/

   Installed /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28- 
py2.5.egg
   Processing dependencies for zc.buildout
   Searching for setuptools==0.6c6
   Best match: setuptools 0.6c6
   Processing setuptools-0.6c6-py2.5.egg
   Adding setuptools 0.6c6 to easy-install.pth file
   Installing easy_install script to /home/jim/tmp/env/bin/
   Installing easy_install-2.5 script to /home/jim/tmp/env/bin/

   Installed /home/jim/tmp/env/lib/python2.5/setuptools-0.6c6-py2.5.egg
   Processing dependencies for setuptools==0.6c6
   Finished processing dependencies for setuptools==0.6c6
   Finished installing setuptools==0.6c6
   Finished processing dependencies for zc.buildout
   Finished installing zc.buildout

   real	0m7.006s
   user	0m0.244s
   sys	0m0.040s

Note:

- We made far fewer requests with the new index

- Most of the time in the second example was spent actually  
downloading the buildout distribution.  Most of the time in the first  
example was spent reading the index.

- I used workingenv to create clean environments for each of the  
examples above.

WRT zc.buildout, refreshing a buildout with just ZODB installed in it  
takes about 45 seconds for me using PyPI and about 5 seconds using  
the experimental index.

Some of the speed improvements is due to the fact that the  
experimental index is much closer to me (on the net) than PyPI.  ATM,  
requests to PyPI take *me* around 500 milliseconds, while requests to  
the experimental index are taking between 100 and 300 milliseconds.  
(I'm at home and this seems to be somewhat variable.)  Most of the  
speed improvements are from reducing the number of requests.

I'm polling PyPI once a minute to get and apply updates. Thanks to  
the new XML-RPC method that Martin added, this is very efficient to do.

I encourage people to check this out and even try using it with  
easy_install and especially buildout. AFAIK, aside from being much  
faster and showing download files for hidden releases it is  
completely equivalent to PyPI for setuptools use.  My intension is to  
keep this experimental index going and up to date for the foreseeable  
future and plan to use it for all my work.

My primary goal is to prototype the new index format.  If this seems  
useful, then I think that www.python.org should expose an index in  
this format to setuptools, either at a different URL or by satisfying  
setuptools requests from the index based on client information.  I'd  
love to see this index populated via a baking mechanism that updates  
package pages when they change, rather than through polling as I'm  
doing.

There would be some benefit to having geographic mirrors.  I suspect  
that having such mirrors available would improve performance further,  
at least for some folks.  It might also be useful to have some  
mirrors for redundancy purposes.  Note though that what I'm doing is  
mirroring the only index data. I'm not mirroring distributions.  Of  
course, I'd be happy to make my software available. (It already is  
via our subversion repository.)

I hope this effort spurs useful discussion and progress.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Fri Jul 20 10:21:18 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 20 Jul 2007 10:21:18 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
Message-ID: <46A0707E.6000103@v.loewis.de>

> I've created and experimental prototype setuptools-specific package  
> index at
> 
>    http://download.zope.org/ppix

Cool! If this proves useful, people are encouraged to contribute the
proper patches to PyPI to regenerate the page directly on each log
change.

There is a slight transactional trickiness to doing so: If you
regenerate before the commit, it might be that the commit fails;
then you would have to rollback the page update, too. If you
regenerate after commit, it might be that you run into race
conditions if the same package sees two	updates in two
transactions very quickly, and the second regeneration completes
before the first one.

If people would find it easier to make these pages dynamic,
such patches would also be kindly accepted. Generating the
pages on access should be fairly cheap; the SQL is

select filename,md5_digest from release_files where name='setuptools';

and putting the result of that into an ppix-like HTML page
should be much faster than invoking ZPT.

Regards,
Martin


From ct at gocept.com  Fri Jul 20 12:02:45 2007
From: ct at gocept.com (Christian Theune)
Date: Fri, 20 Jul 2007 12:02:45 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
Message-ID: <1184925765.6519.3.camel@mindy>

Am Donnerstag, den 19.07.2007, 07:06 -0400 schrieb Jim Fulton:
> I promised to prototype the format I suggested.
> 
> I've created and experimental prototype setuptools-specific package  
> index at
> 
>    http://download.zope.org/ppix

Yay! This works like a charme!

> There would be some benefit to having geographic mirrors.  I suspect  
> that having such mirrors available would improve performance further,  
> at least for some folks.  It might also be useful to have some  
> mirrors for redundancy purposes.  Note though that what I'm doing is  
> mirroring the only index data. I'm not mirroring distributions.  Of  
> course, I'd be happy to make my software available. (It already is  
> via our subversion repository.)

I'd be happy to support mirroring once all this is sorted out/ I can
offer a server in Germany/Europe.

Christian


From jim at zope.com  Fri Jul 20 13:45:57 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 20 Jul 2007 07:45:57 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <46A0707E.6000103@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A0707E.6000103@v.loewis.de>
Message-ID: <5105308E-F651-438B-8C3D-F5FCAF8A8351@zope.com>


On Jul 20, 2007, at 4:21 AM, Martin v. L?wis wrote:

>> I've created and experimental prototype setuptools-specific package
>> index at
>>
>>    http://download.zope.org/ppix
>
> Cool! If this proves useful, people are encouraged to contribute the
> proper patches to PyPI to regenerate the page directly on each log
> change.
>
> There is a slight transactional trickiness to doing so: If you
> regenerate before the commit, it might be that the commit fails;
> then you would have to rollback the page update, too. If you
> regenerate after commit, it might be that you run into race
> conditions if the same package sees two	updates in two
> transactions very quickly, and the second regeneration completes
> before the first one.
>
> If people would find it easier to make these pages dynamic,
> such patches would also be kindly accepted. Generating the
> pages on access should be fairly cheap; the SQL is
>
> select filename,md5_digest from release_files where name='setuptools';
>
> and putting the result of that into an ppix-like HTML page
> should be much faster than invoking ZPT.

A few notes.

It is important to show files from hidden releases as well as  
unhidden releases.  I suspect the select statement above does that.

I parse long descriptions to get #egg= links.  I also give some  
special care to urls that point back to PyPI to avoid having  
setuptools go back to the human interface.

It might be easiest to just trigger the existing ppix sw to poll  
after a change.  Thanks to your xmlrpc addition, polling is quite  
cheap.  Alternatively, we could install the existing software in a  
way that polls more or less continuously.  This would be quite  
trivial.  What you suggest is probably cleaner but requires some  
expertise with the current software. :)  I'd much rather generate  
static files (as I'm doing now) than serve these dynamically.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Fri Jul 20 13:48:39 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 20 Jul 2007 07:48:39 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <1184925765.6519.3.camel@mindy>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<1184925765.6519.3.camel@mindy>
Message-ID: <465B76C9-D7D2-420E-BBBB-E7F24F6FA710@zope.com>


On Jul 20, 2007, at 6:02 AM, Christian Theune wrote:
...
> I'd be happy to support mirroring once all this is sorted out/ I can
> offer a server in Germany/Europe.

If we decide that mirrors would be a good idea, it will be important,  
imo, to select mirror sites bases on their connectivity.  The goal of  
the mirrors should be to try to give people options with short  
network distances.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From ct at gocept.com  Fri Jul 20 13:52:12 2007
From: ct at gocept.com (Christian Theune)
Date: Fri, 20 Jul 2007 13:52:12 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <465B76C9-D7D2-420E-BBBB-E7F24F6FA710@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<1184925765.6519.3.camel@mindy>
	<465B76C9-D7D2-420E-BBBB-E7F24F6FA710@zope.com>
Message-ID: <1184932332.6519.11.camel@mindy>

Am Freitag, den 20.07.2007, 07:48 -0400 schrieb Jim Fulton:
> On Jul 20, 2007, at 6:02 AM, Christian Theune wrote:
> ...
> > I'd be happy to support mirroring once all this is sorted out/ I can
> > offer a server in Germany/Europe.
> 
> If we decide that mirrors would be a good idea, it will be important,  
> imo, to select mirror sites bases on their connectivity.  The goal of  
> the mirrors should be to try to give people options with short  
> network distances.

Right, however, do you have any specific parameters that can be measured
in mind?

(Our server is reasonably well connected, reachable with about 5 hops
from within Germany with latency around 40ms on a DSL line. Multiple
GBit lines to the hosting center.) 

Christian




From jodok at lovelysystems.com  Fri Jul 20 10:50:40 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Fri, 20 Jul 2007 10:50:40 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
Message-ID: <6000B516-6593-4A98-AA08-B6C7B329BC62@lovelysystems.com>

thanks jim.

you save our day. we'll send some austrian cheese over :)

jodok

On 19.07.2007, at 13:06, Jim Fulton wrote:

> Over the past few months, we've struggled quite a bit with Python
> Package Index (PyPI) performance and stability.  Thanks to the heroic
> efforts of Martin v. L?wis and others, performance and especially
> stability have improved quite a bit. Martin has demonstrated that, at
> least when running well, PyPI seems to answer most requests on the
> order of 7 miliseconds (around 150 requests per second) internally.
> That's not bad.  Unfortunately for users, actual times can be quite a
> bit longer.  For me at work, request take around 300 milliseconds.
> For Martin, they seem to take somewhat longer.  300 milliseconds
> isn't so bad for a request or two, however, easy install can easily
> make 10s or even hundreds of requests to satisfy a user request for a
> package.  zc.buildout, when verifying that a large system with many
> tens of packages has the most up to date versions of each package can
> easily make thousands of requests.
>
> Why do setuptools and buildout make so many requests?  If a package
> exposes more than one release, then setuptools checks the package's
> main PyPI page and the pages for each release.  We need to be able to
> easily use older releases, so we can't hide old releases.  Typical
> projects of ours have many old releases exposed.  If setuptools was
> more clever in the way it searched PyPI, but it would still have to
> make a minimum of 2 requests per package for packages with multiple
> versions exposed.
>
> Another potential issue is that PyPI pages can be large.  I've found
> it convenient to use PyPI package pages as the home page for many of
> my projects.  I like to include package documentation in my project
> pages.  Perhaps this is an abuse of PyPI, but it is very convenient
> for me and no one has complained. :)  The zc.buildout pages are
> around 200K.  That's a fair bit of data for setuptools to download
> and scan for download URLs.
>
> In the course of this discussion, I've realized that it doesn't make
> sense for setuptools to use the same interface that humans use.
> setuptools doesn't need to see all of the data that is useful to
> humans. Similarly, humans generally don't need to see all of the
> historical releases for a project.  I suggested a simple page format
> designed just for setuptools.  An alternative would be an xmlrpc
> API.  I prefer pages because I think that, over time, the amount of
> requests from automated tools like easy_install and zc.buildout will
> increase substantially and ultimately, will overwhelm dynamic
> servers, even ones like PyPI that are reasonably fast.  I also think
> that a simple static collection of pages will be easier to mirror and
> I think some number of geographic mirrors is likely to help some
> people.  I promised to prototype the format I suggested.
>
> I've created and experimental prototype setuptools-specific package
> index at
>
>    http://download.zope.org/ppix
>
> Going to that page gives brief instructions for using it with
> easy_install and zc.buildout.  To see an individual package page, add
> the package name to the URL, as in:
>
>    http://download.zope.org/ppix/setuptools/
>
> A few things to note about this:
>
> - I don't expose a long package list at http://download.zope.org/
> ppix/.  The long package list would be expensive to download and
> supports a use case that I consider to be of negative value, which is
> installing packages with case-insensitive package names,  I think it
> is important for humans to be able to search for packages using case-
> insensitive search terms, but I think that, after identifying a
> package, precise package names should be used.  I think it is
> especially important that precise package names be used in package
> requirements.
>
> - There is a single page per package.  This can greatly reduce the
> number of requests.  Packages that store all of their distributions
> in PyPI and that don't have off-site home pages or download URLs can
> be scanned with a single request.  Note that I excluded home page and
> download URLs that pointed back to the packages PyPI page, as that
> wouldn't provide any new information to setuptools.
>
> - Download URLs for *hidden* packages are included.  Humans don't
> need to see old revisions, but setuptools-based tools do.  If we used
> an index like this for setuptools, we could stop unhiding old
> releases when we created new releases in PyPI.  This would make PyPI
> more useful to humans and less of a pain for developers.
>
> - Download URLs are the same as they are in PyPI.  Using this new
> index, distributions are still downloaded from PyPI, so the index
> doesn't affect PyPI download statistics.
>
> To see the impact of this, it's interesting to look at installing
> zc.buildout using easy_install from PyPI and from the experimental
> index:
> Installing using PyPI looks like this:
>
>    (env)jim at ds9:~/tmp$ time easy_install zc.buildout
>    Searching for zc.buildout
>    Reading http://cheeseshop.python.org/pypi/zc.buildout/
>    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b19
>    Reading http://svn.zope.org/zc.buildout
>    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b22
>    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b23
>    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b20
>    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b21
>    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b26
>    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b27
>    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b24
>    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b25
>    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b28
>    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b17
>    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b16
>    Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b18
>    Best match: zc.buildout 1.0.0b28
>    Downloading http://cheeseshop.python.org/packages/2.5/z/
> zc.buildout/zc.buildout-1.0.0b28-
> py2.5.egg#md5=4e37e53f010ed7984555a029732f479d
>    Processing zc.buildout-1.0.0b28-py2.5.egg
>    creating /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-
> py2.5.egg
>    Extracting zc.buildout-1.0.0b28-py2.5.egg to /home/jim/tmp/env/lib/
> python2.5
>    Adding zc.buildout 1.0.0b28 to easy-install.pth file
>    Installing buildout script to /home/jim/tmp/env/bin/
>
>    Installed /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-
> py2.5.egg
>    Processing dependencies for zc.buildout
>    Searching for setuptools==0.6c6
>    Best match: setuptools 0.6c6
>    Processing setuptools-0.6c6-py2.5.egg
>    Adding setuptools 0.6c6 to easy-install.pth file
>    Installing easy_install script to /home/jim/tmp/env/bin/
>    Installing easy_install-2.5 script to /home/jim/tmp/env/bin/
>
>    Installed /home/jim/tmp/env/lib/python2.5/setuptools-0.6c6- 
> py2.5.egg
>    Processing dependencies for setuptools==0.6c6
>    Finished processing dependencies for setuptools==0.6c6
>    Finished installing setuptools==0.6c6
>    Finished processing dependencies for zc.buildout
>    Finished installing zc.buildout
>
>    real	0m31.360s
>    user	0m1.136s
>    sys	0m0.060s
>
> Note the large number of pages read.  Here I was installing a single
> package with one dependency, setuptools, that was already installed.
> Let's look at this again using the experimental index:
>
>    (env)jim at ds9:~/tmp$ time easy_install -i http://download.zope.org/
> ppix zc.buildout
>    Searching for zc.buildout
>    Reading http://download.zope.org/ppix/zc.buildout/
>    Best match: zc.buildout 1.0.0b28
>    Downloading http://cheeseshop.python.org/packages/2.5/z/
> zc.buildout/zc.buildout-1.0.0b28-
> py2.5.egg#md5=4e37e53f010ed7984555a029732f479d
>    Processing zc.buildout-1.0.0b28-py2.5.egg
>    creating /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-
> py2.5.egg
>    Extracting zc.buildout-1.0.0b28-py2.5.egg to /home/jim/tmp/env/lib/
> python2.5
>    Adding zc.buildout 1.0.0b28 to easy-install.pth file
>    Installing buildout script to /home/jim/tmp/env/bin/
>
>    Installed /home/jim/tmp/env/lib/python2.5/zc.buildout-1.0.0b28-
> py2.5.egg
>    Processing dependencies for zc.buildout
>    Searching for setuptools==0.6c6
>    Best match: setuptools 0.6c6
>    Processing setuptools-0.6c6-py2.5.egg
>    Adding setuptools 0.6c6 to easy-install.pth file
>    Installing easy_install script to /home/jim/tmp/env/bin/
>    Installing easy_install-2.5 script to /home/jim/tmp/env/bin/
>
>    Installed /home/jim/tmp/env/lib/python2.5/setuptools-0.6c6- 
> py2.5.egg
>    Processing dependencies for setuptools==0.6c6
>    Finished processing dependencies for setuptools==0.6c6
>    Finished installing setuptools==0.6c6
>    Finished processing dependencies for zc.buildout
>    Finished installing zc.buildout
>
>    real	0m7.006s
>    user	0m0.244s
>    sys	0m0.040s
>
> Note:
>
> - We made far fewer requests with the new index
>
> - Most of the time in the second example was spent actually
> downloading the buildout distribution.  Most of the time in the first
> example was spent reading the index.
>
> - I used workingenv to create clean environments for each of the
> examples above.
>
> WRT zc.buildout, refreshing a buildout with just ZODB installed in it
> takes about 45 seconds for me using PyPI and about 5 seconds using
> the experimental index.
>
> Some of the speed improvements is due to the fact that the
> experimental index is much closer to me (on the net) than PyPI.  ATM,
> requests to PyPI take *me* around 500 milliseconds, while requests to
> the experimental index are taking between 100 and 300 milliseconds.
> (I'm at home and this seems to be somewhat variable.)  Most of the
> speed improvements are from reducing the number of requests.
>
> I'm polling PyPI once a minute to get and apply updates. Thanks to
> the new XML-RPC method that Martin added, this is very efficient to  
> do.
>
> I encourage people to check this out and even try using it with
> easy_install and especially buildout. AFAIK, aside from being much
> faster and showing download files for hidden releases it is
> completely equivalent to PyPI for setuptools use.  My intension is to
> keep this experimental index going and up to date for the foreseeable
> future and plan to use it for all my work.
>
> My primary goal is to prototype the new index format.  If this seems
> useful, then I think that www.python.org should expose an index in
> this format to setuptools, either at a different URL or by satisfying
> setuptools requests from the index based on client information.  I'd
> love to see this index populated via a baking mechanism that updates
> package pages when they change, rather than through polling as I'm
> doing.
>
> There would be some benefit to having geographic mirrors.  I suspect
> that having such mirrors available would improve performance further,
> at least for some folks.  It might also be useful to have some
> mirrors for redundancy purposes.  Note though that what I'm doing is
> mirroring the only index data. I'm not mirroring distributions.  Of
> course, I'd be happy to make my software available. (It already is
> via our subversion repository.)
>
> I hope this effort spurs useful discussion and progress.
>
> Jim
>
> --
> Jim Fulton			mailto:jim at zope.com		Python Powered!
> CTO 				(540) 361-1714			http://www.python.org
> Zope Corporation	http://www.zope.com		http://www.zope.org
>
>
>
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig

--
"Although never is often better than *right* now."
   -- The Zen of Python, by Tim Peters

Jodok Batlogg, Lovely Systems
Schmelzh?tterstra?e 26a, 6850 Dornbirn, Austria
phone: +43 5572 908060, fax: +43 5572 908060-77


-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2454 bytes
Desc: not available
Url : http://mail.python.org/pipermail/catalog-sig/attachments/20070720/26d28f4b/attachment-0001.bin 

From jim at zope.com  Fri Jul 20 15:42:37 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 20 Jul 2007 09:42:37 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <1184932332.6519.11.camel@mindy>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<1184925765.6519.3.camel@mindy>
	<465B76C9-D7D2-420E-BBBB-E7F24F6FA710@zope.com>
	<1184932332.6519.11.camel@mindy>
Message-ID: <5686B35D-34DD-49FE-A8E7-37397A4AE808@zope.com>


On Jul 20, 2007, at 7:52 AM, Christian Theune wrote:

> Am Freitag, den 20.07.2007, 07:48 -0400 schrieb Jim Fulton:
>> On Jul 20, 2007, at 6:02 AM, Christian Theune wrote:
>> ...
>>> I'd be happy to support mirroring once all this is sorted out/ I can
>>> offer a server in Germany/Europe.
>>
>> If we decide that mirrors would be a good idea, it will be important,
>> imo, to select mirror sites bases on their connectivity.  The goal of
>> the mirrors should be to try to give people options with short
>> network distances.
>
> Right, however, do you have any specific parameters that can be  
> measured
> in mind?

I'm not enough of a network expert.  Hopefully, someone more  
knowledgeable will make a suggestion.  BTW, with the current PyPI  
performance, I'm guessing we could have 10s of mirrors poll once a  
minute without affecting other users.

> (Our server is reasonably well connected, reachable with about 5 hops
> from within Germany with latency around 40ms on a DSL line. Multiple
> GBit lines to the hosting center.)

I didn't mean to suggest that you weren't well connected.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Fri Jul 20 22:09:39 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 20 Jul 2007 16:09:39 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
Message-ID: <20070720200721.88E1D3A403A@sparrow.telecommunity.com>

At 07:06 AM 7/19/2007 -0400, Jim Fulton wrote:
>I've created and experimental prototype setuptools-specific package
>index at
>
>    http://download.zope.org/ppix
>
>Going to that page gives brief instructions for using it with
>easy_install and zc.buildout.

FYI, the handling of homepage and download links is broken.  You have 
e.g. 'meta="homepage"' instead of 'rel="homepage"', so easy_install 
doesn't pick these up and look for links there, meaning that ppix 
fails to find downloads for e.g. pywin32 which is hosted at Sourceforge.

(On a perhaps not entirely unrelated note, the Cheeseshop appears to 
be down at the moment:

"""Error...

There's been a problem with your request

psycopg.OperationalError: no connection to the server""")


By the way, I'd suggest explaining (or linking to an explanation) on 
the ppix main page describing how to configure easy_install such that 
the '-i' option isn't necessary.  Perhaps we could add an example to 
the EasyInstall docs somewhere near:

http://peak.telecommunity.com/DevCenter/EasyInstall#creating-your-own-package-index

and then link to it from the ppix page.


From jim at zope.com  Fri Jul 20 22:07:08 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 20 Jul 2007 16:07:08 -0400
Subject: [Catalog-sig] PyPI is down with a psycopg error
Message-ID: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com>

Requests to http://www.python.org/pypi are giving:

   Error...

   There's been a problem with your request

   psycopg.OperationalError: no connection to the server

This (or something like it) has been happening since 7:54 UTC. I know  
because my once a minute cron job to update ppix has been failing  
since then. :)

The good news is that folks who have switched to using http:// 
download.zope.org/ppix/ for setuptools (easy_install and buildout)  
are unaffected.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Fri Jul 20 22:18:55 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 20 Jul 2007 16:18:55 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <20070720200721.88E1D3A403A@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<20070720200721.88E1D3A403A@sparrow.telecommunity.com>
Message-ID: <24B11DD1-DD79-4171-A38F-06B642EC354B@zope.com>


On Jul 20, 2007, at 4:09 PM, Phillip J. Eby wrote:

> At 07:06 AM 7/19/2007 -0400, Jim Fulton wrote:
>> I've created and experimental prototype setuptools-specific package
>> index at
>>
>>    http://download.zope.org/ppix
>>
>> Going to that page gives brief instructions for using it with
>> easy_install and zc.buildout.
>
> FYI, the handling of homepage and download links is broken.  You  
> have e.g. 'meta="homepage"' instead of 'rel="homepage"', so  
> easy_install doesn't pick these up and look for links there,  
> meaning that ppix fails to find downloads for e.g. pywin32 which is  
> hosted at Sourceforge.

Doh! Fixed.


> (On a perhaps not entirely unrelated note, the Cheeseshop appears  
> to be down at the moment:
>
> """Error...
>
> There's been a problem with your request
>
> psycopg.OperationalError: no connection to the server""")
>
>
> By the way, I'd suggest explaining (or linking to an explanation)  
> on the ppix main page describing how to configure easy_install such  
> that the '-i' option isn't necessary.

If you send me some text, I'd be happy to add it to the ppix main page.


>   Perhaps we could add an example to the EasyInstall docs somewhere  
> near:
>
> http://peak.telecommunity.com/DevCenter/EasyInstall#creating-your- 
> own-package-index
>

> and then link to it from the ppix page.

+1

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From benji at benjiyork.com  Fri Jul 20 22:04:29 2007
From: benji at benjiyork.com (Benji York)
Date: Fri, 20 Jul 2007 16:04:29 -0400
Subject: [Catalog-sig] Cheeseshop down
Message-ID: <46A1154D.7000708@benjiyork.com>

Fulfilling my dutifully sworn obligation to report every instance of 
PYPI being down:

"""
Error...

There's been a problem with your request

psycopg.OperationalError: no connection to the server
"""
-- 
Benji York
http://benjiyork.com

From bray at sent.com  Fri Jul 20 23:01:16 2007
From: bray at sent.com (Brian Ray)
Date: Fri, 20 Jul 2007 16:01:16 -0500
Subject: [Catalog-sig] PyPI is down with a psycopg error
In-Reply-To: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com>
References: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com>
Message-ID: <B40F96BB-62DA-4C66-ACEB-74806C39C249@sent.com>


On Jul 20, 2007, at 3:07 PM, Jim Fulton wrote:

>
>    Error...
>
>    There's been a problem with your request
>
>    psycopg.OperationalError: no connection to the server
>

Come on!

Still down.

Not Good.  Does anybody know a short term fix and a long term solution.


Brian Ray
bray at sent.com


From jim at zope.com  Fri Jul 20 23:10:26 2007
From: jim at zope.com (Jim Fulton)
Date: Fri, 20 Jul 2007 17:10:26 -0400
Subject: [Catalog-sig] PyPI is down with a psycopg error
In-Reply-To: <B40F96BB-62DA-4C66-ACEB-74806C39C249@sent.com>
References: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com>
	<B40F96BB-62DA-4C66-ACEB-74806C39C249@sent.com>
Message-ID: <F468092D-F97A-4108-8379-7E6E98E84CFA@zope.com>


On Jul 20, 2007, at 5:01 PM, Brian Ray wrote:

>
> On Jul 20, 2007, at 3:07 PM, Jim Fulton wrote:
>
>>
>>    Error...
>>
>>    There's been a problem with your request
>>
>>    psycopg.OperationalError: no connection to the server
>>
>
> Come on!
>
> Still down.
>
> Not Good.  Does anybody know a short term fix and a long term  
> solution.

If you're using it for easy_install or buildout, use http:// 
download.zope.org/ppix as your package index.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From richardjones at optushome.com.au  Sat Jul 21 01:34:34 2007
From: richardjones at optushome.com.au (Richard Jones)
Date: Sat, 21 Jul 2007 09:34:34 +1000
Subject: [Catalog-sig] PyPI is down with a psycopg error
In-Reply-To: <B40F96BB-62DA-4C66-ACEB-74806C39C249@sent.com>
References: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com>
	<B40F96BB-62DA-4C66-ACEB-74806C39C249@sent.com>
Message-ID: <200707210934.34159.richardjones@optushome.com.au>

On Sat, 21 Jul 2007, Brian Ray wrote:
> On Jul 20, 2007, at 3:07 PM, Jim Fulton wrote:
> >    Error...
> >
> >    There's been a problem with your request
> >
> >    psycopg.OperationalError: no connection to the server
>
> Come on!

Yes, because complaining about it will fix it.

Postgres is up and running, but the web interface is reporting the above 
errors as though it can't connect. I can only assume that the persistent 
connection has run into trouble. I've disabled persistent connections in the 
fcgi config, but now apache will need restarting. I'm trying to contact 
someone who can do that.


> Not Good.  Does anybody know a short term fix and a long term solution.

You can volunteer to also be a maintainer of the system.


     Richard

From richardjones at optushome.com.au  Sat Jul 21 02:08:35 2007
From: richardjones at optushome.com.au (Richard Jones)
Date: Sat, 21 Jul 2007 10:08:35 +1000
Subject: [Catalog-sig] PyPI is down with a psycopg error
In-Reply-To: <200707210934.34159.richardjones@optushome.com.au>
References: <81A19504-87CD-412E-9D9A-5CE52C86EA68@zope.com>
	<B40F96BB-62DA-4C66-ACEB-74806C39C249@sent.com>
	<200707210934.34159.richardjones@optushome.com.au>
Message-ID: <200707211008.35954.richardjones@optushome.com.au>

On Sat, 21 Jul 2007, Richard Jones wrote:
> I'm trying to contact someone who can do that.

It looks like one of the volunteer sysadmins has now restarted apache and the 
database connection issues are no more.


    Richard

From martin at v.loewis.de  Sat Jul 21 08:05:13 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 21 Jul 2007 08:05:13 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <20070720200721.88E1D3A403A@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<20070720200721.88E1D3A403A@sparrow.telecommunity.com>
Message-ID: <46A1A219.60906@v.loewis.de>

> (On a perhaps not entirely unrelated note, the Cheeseshop appears to 
> be down at the moment:
> 
> """Error...
> 
> There's been a problem with your request
> 
> psycopg.OperationalError: no connection to the server""")

Around that time, the Postgres log has these entries:

2007-07-20 21:53:24 [14636] LOG:  received fast shutdown request
2007-07-20 21:53:24 [14636] LOG:  aborting any active transactions
2007-07-20 21:53:24 [26166] FATAL:  terminating connection due to
administrator command
2007-07-20 21:53:24 [15769] FATAL:  terminating connection due to
administrator command
2007-07-20 21:53:24 [10390] FATAL:  terminating connection due to
administrator command
2007-07-20 21:53:24 [31182] FATAL:  terminating connection due to
administrator command
2007-07-20 21:53:24 [30066] FATAL:  terminating connection due to
administrator command
2007-07-20 21:53:24 [10162] FATAL:  terminating connection due to
administrator command
2007-07-20 21:53:24 [17452] FATAL:  terminating connection due to
administrator command
2007-07-20 21:53:24 [17147] FATAL:  terminating connection due to
administrator command
2007-07-20 21:53:24 [1159] LOG:  shutting down
2007-07-20 21:53:26 [1159] LOG:  database system is shut down
2007-07-20 21:53:33 [1469] LOG:  database system was shut down at
2007-07-20 21:53:26 CEST
2007-07-20 21:53:33 [1469] LOG:  checkpoint record is at A/FD833F0
2007-07-20 21:53:33 [1469] LOG:  redo record is at A/FD833F0; undo
record is at 0/0; shutdown TRUE
2007-07-20 21:53:33 [1469] LOG:  next transaction ID: 110977718; next
OID: 61913929
2007-07-20 21:53:33 [1469] LOG:  database system is ready

and Sean Reifschneider was logged in, so I suspect he did some
maintenance work.

Sean?

Regards,
Martin

From jafo at tummy.com  Sat Jul 21 08:17:20 2007
From: jafo at tummy.com (Sean Reifschneider)
Date: Sat, 21 Jul 2007 00:17:20 -0600
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific
	PyPI	index.
In-Reply-To: <46A1A219.60906@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<20070720200721.88E1D3A403A@sparrow.telecommunity.com>
	<46A1A219.60906@v.loewis.de>
Message-ID: <20070721061720.GB4489@tummy.com>

On Sat, Jul 21, 2007 at 08:05:13AM +0200, "Martin v. L?wis" wrote:
>Around that time, the Postgres log has these entries:

There was an upgrade of Postgres done earlier, as far as I can see,
pypi is running.  It must have been resolved earlier.  AMK mentioned there
was a problem with the upgrade restart and Apache had to be restarted, that
was like 6 hours ago though.

Thanks,
Sean
-- 
 "I not only use all the brains that I have, but all that I can borrow."
                 -- Woodrow Wilson
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From martin at v.loewis.de  Sat Jul 21 19:00:30 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 21 Jul 2007 19:00:30 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
Message-ID: <46A23BAE.5090907@v.loewis.de>

> I've created and experimental prototype setuptools-specific package  
> index at
> 
>    http://download.zope.org/ppix

I've now added something similar as

http://cheeseshop.python.org/simple/

It differs from your site in a few ways:

- it does include a top-level index of all packages (but neither
  releases nor descriptions)
- it's always current, due to being dynamically computed
- it may differ in the precise list of URLs displayed;
  if there are important deviations, please let me know.

Regards,
Martin

From jim at zope.com  Sat Jul 21 19:12:48 2007
From: jim at zope.com (Jim Fulton)
Date: Sat, 21 Jul 2007 13:12:48 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A23BAE.5090907@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
Message-ID: <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>


On Jul 21, 2007, at 1:00 PM, Martin v. L?wis wrote:

>> I've created and experimental prototype setuptools-specific package
>> index at
>>
>>    http://download.zope.org/ppix
>
> I've now added something similar as
>
> http://cheeseshop.python.org/simple/

Way cool!

>
> It differs from your site in a few ways:
>
> - it does include a top-level index of all packages (but neither
>   releases nor descriptions)

Why?  This is a relatively expensive page, due to it's size I assume,  
that really provides no value.  This will slow down setuptools.

> - it's always current, due to being dynamically computed

And also unreliable, for the same reason. For example, it would have  
been inaccessible yesterday afternoon. And also puts more load on the  
server.  It would be much better imo if static pages could be written  
on writes.

> - it may differ in the precise list of URLs displayed;
>   if there are important deviations, please let me know.

The download and homepage URL anchors need rel="download" or  
rel="homepage".

They lack the #egg= links.

Compare your page for setuptools to mine.

Also, some packages use their pypi pages as their home page links.   
You want to exclude these, otherwise, setuptools will circle around  
to the human interface, which defeats point of the simple interface.

Thanks for plugging away on this.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Sat Jul 21 19:48:16 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 21 Jul 2007 13:48:16 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A23BAE.5090907@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
Message-ID: <20070721174558.DDF923A403A@sparrow.telecommunity.com>

At 07:00 PM 7/21/2007 +0200, Martin v. L?wis wrote:
> > I've created and experimental prototype setuptools-specific package
> > index at
> >
> >    http://download.zope.org/ppix
>
>I've now added something similar as
>
>http://cheeseshop.python.org/simple/

It's very fast, thanks.


>It differs from your site in a few ways:
>
>- it does include a top-level index of all packages (but neither
>   releases nor descriptions)

Unfortunately, that doesn't help current versions of setuptools.  See 
point #7 of:

http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api

Setuptools looks for release links, not package links on that page.

Compare:

$ easy_install -vvvi http://cheeseshop.python.org/simple Pywin32
Searching for Pywin32
Reading http://cheeseshop.python.org/simple/Pywin32/
Couldn't find index page for 'Pywin32' (maybe misspelled?)
Scanning index of all packages (this may take a while)
Reading http://cheeseshop.python.org/simple/
No local packages or download links found for Pywin32
error: Could not find suitable distribution for Requirement.parse('Pywin32')

$ easy_install -vvvi http://cheeseshop.python.org/pypi Pywin32
Searching for Pywin32
Reading http://cheeseshop.python.org/pypi/Pywin32/
Couldn't find index page for 'Pywin32' (maybe misspelled?)
Scanning index of all packages (this may take a while)
Reading http://cheeseshop.python.org/pypi/
Reading http://cheeseshop.python.org/pypi/pywin32/210
Reading http://sf.net/projects/pywin32
...


>- it's always current, due to being dynamically computed
>- it may differ in the precise list of URLs displayed;
>   if there are important deviations, please let me know.

Jim's already mentioned these, but the rel="" info (per the index API 
spec's point #6), and the links embedded in the long_description 
field (per point #4) are missing.  Without these, easy_install can't 
find sourceforge links, subversion checkouts, or any other embedded 
direct download links.  For example:

$ easy_install -vvvi http://cheeseshop.python.org/simple pywin32
Searching for pywin32
Reading http://cheeseshop.python.org/simple/pywin32/
No local packages or download links found for pywin32
error: Could not find suitable distribution for Requirement.parse('pywin32')

$ easy_install -vvvi http://cheeseshop.python.org/pypi pywin32
Searching for pywin32
Reading http://cheeseshop.python.org/pypi/pywin32/
Reading http://sf.net/projects/pywin32
Reading http://sourceforge.net/project/showfiles.php?group_id=78018
Found link: 
http://downloads.sourceforge.net/pywin32/pywin32-210.win32-py2.2.exe?modtime=1159009204&amp;big_mirror=0
...[a dozen more links]

$ easy_install -i http://cheeseshop.python.org/simple setuptools==dev
Searching for setuptools==dev
Reading http://cheeseshop.python.org/simple/setuptools/
No local packages or download links found for setuptools==dev
error: Could not find suitable distribution for 
Requirement.parse('setuptools==dev')

$ easy_install -i http://cheeseshop.python.org/pypi setuptools==dev
Searching for setuptools==dev
Reading http://cheeseshop.python.org/pypi/setuptools/
Reading http://cheeseshop.python.org/pypi/setuptools
Reading http://cheeseshop.python.org/pypi/setuptools/0.6c6
Best match: setuptools dev
Downloading 
http://svn.python.org/projects/sandbox/trunk/setuptools/#egg=setuptools-dev
Doing subversion checkout from 
http://svn.python.org/projects/sandbox/trunk/setuptools/ to ...


From martin at v.loewis.de  Sat Jul 21 21:08:52 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 21 Jul 2007 21:08:52 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
Message-ID: <46A259C4.6090605@v.loewis.de>

>> - it does include a top-level index of all packages (but neither
>>   releases nor descriptions)
> 
> Why?  This is a relatively expensive page, due to it's size I assume,
> that really provides no value.  This will slow down setuptools.

IIUC, it won't slow down setuptools, as setuptools looks at it only
if it cannot find the real package page due to a misspelling. So
as long as everything is spelled correctly, it should not provide
any slowdown.

If people do misspell a package name when invoking easy_install,
they get the feature that you consider of no value.

As for performance - 30 downloads take 3.9s currently from nearby.

>> - it's always current, due to being dynamically computed
> 
> And also unreliable, for the same reason. For example, it would have
> been inaccessible yesterday afternoon.

The same could happen to Apache, too, of course. svn.python.org
sometimes fails to restart when a restart is request on log rotation.

Any software is unreliable; to reduce downtime, you need an operator
that is available when something breaks.

> And also puts more load on the server.  It would be much better imo
> if static pages could be written on writes.

Contributions are welcome. In addition to me considering it futile,
I also don't know how to implement it correctly.

>> - it may differ in the precise list of URLs displayed;
>>   if there are important deviations, please let me know.
> 
> The download and homepage URL anchors need rel="download" or
> rel="homepage".

Done.

> They lack the #egg= links.

How are these computed?

> Also, some packages use their pypi pages as their home page links.

Ok, done.

Regards,
Martin

From martin at v.loewis.de  Sat Jul 21 21:23:30 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 21 Jul 2007 21:23:30 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <20070721174558.DDF923A403A@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<20070721174558.DDF923A403A@sparrow.telecommunity.com>
Message-ID: <46A25D32.4080606@v.loewis.de>

> Unfortunately, that doesn't help current versions of setuptools.  See
> point #7 of:
> 
> http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api
> 
> Setuptools looks for release links, not package links on that page.

I don't understand. What's a "release link"? The links on the index
page *do* go to the "project's active version pages", as specified
(there aren't any numbered version pages)

Jim left out that page entirely - are you saying it is impossible
to provide such an index page with the page structure that Jim
proposed?

> $ easy_install -vvvi http://cheeseshop.python.org/simple Pywin32
> Searching for Pywin32
> Reading http://cheeseshop.python.org/simple/Pywin32/
> Couldn't find index page for 'Pywin32' (maybe misspelled?)
> Scanning index of all packages (this may take a while)
> Reading http://cheeseshop.python.org/simple/
> No local packages or download links found for Pywin32

I see that it doesn't work, but I cannot understand why.
On

http://cheeseshop.python.org/simple/

"pywin32" is clearly linked, so it should be able to resolve
the misspelling.

> Jim's already mentioned these, but the rel="" info (per the index API
> spec's point #6),

This is fixed.

> and the links embedded in the long_description field
> (per point #4) are missing.

I have to think about this more. Is it correct that you want all href
attributes of all a elements in the long_description? And how do you
know what the long_description is from just looking at the rendered
page?

Regards,
Martin

From pje at telecommunity.com  Sat Jul 21 21:51:26 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 21 Jul 2007 15:51:26 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A25D32.4080606@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<20070721174558.DDF923A403A@sparrow.telecommunity.com>
	<46A25D32.4080606@v.loewis.de>
Message-ID: <20070721194908.F16373A403A@sparrow.telecommunity.com>

At 09:23 PM 7/21/2007 +0200, Martin v. L?wis wrote:
> > Unfortunately, that doesn't help current versions of setuptools.  See
> > point #7 of:
> >
> > http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api
> >
> > Setuptools looks for release links, not package links on that page.
>
>I don't understand. What's a "release link"? The links on the index
>page *do* go to the "project's active version pages", as specified
>(there aren't any numbered version pages)

See point #2:

"""2. Individual project version pages' URLs must be of the form 
base/projectname/version, where base is the package index's base URL."""

That's what's meant by "version pages" in point #7 -- i.e., they 
*must* be of that two-part form for setuptools to recognize them as such.


>I see that it doesn't work, but I cannot understand why.
>On
>
>http://cheeseshop.python.org/simple/
>
>"pywin32" is clearly linked, so it should be able to resolve
>the misspelling.

It could perhaps be *changed* to do so, but at present it follows the 
spec's definition of "version page" URLs.


> > Jim's already mentioned these, but the rel="" info (per the index API
> > spec's point #6),
>
>This is fixed.

Great; Sourceforge and other offsite download pages work now.


> > and the links embedded in the long_description field
> > (per point #4) are missing.
>
>I have to think about this more. Is it correct that you want all href
>attributes of all a elements in the long_description?

Yes; of course, the usual rendering needs to be applied, since 
long_description can contain reStructuredText.


>  And how do you
>know what the long_description is from just looking at the rendered
>page?

You don't need to; easy_install discovers those links the same way it 
does any other Cheeseshop-provided download links.  From 
easy_install's point of view, the entire page is just one big mass of 
links that might point to downloads:

"""4. ...It is explicitly permitted for a project's 
"long_description" to include URLs, and these should be formatted as 
HTML links by the package index, as EasyInstall does *no special 
processing* [emph. added] to identify what parts of a page are 
index-specific and which are part of the project's supplied description."""

In other words, the *only* links that are specially handled are the 
"rel" ones, which it follows unconditionally to look for additional 
direct download links.  All other links are merely *inspected* to see 
if they obviously refer to a downloadable package (e.g. .tgz, .zip, 
.egg, .exe etc., or explicitly-marked #egg).  As a side-effect, this 
means that links to perform Cheeseshop operations, links to other 
parts of python.org, etc. are simply ignored, as they are not links 
to downloadables nor marked as #egg.

If a URL can be determined by inspection to be a download link, then 
easy_install extracts version and platform info from the URL and adds 
it as a candidate for download selection.  When both the home page 
and download URL have been read, along with any detected "active 
version pages" (as defined above), then easy_install chooses the 
"best" download URL from all the candidates it has seen up to that point.


From pje at telecommunity.com  Sat Jul 21 21:53:40 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 21 Jul 2007 15:53:40 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <f7tmf3$nal$1@sea.gmane.org>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de> <f7tmf3$nal$1@sea.gmane.org>
Message-ID: <20070721195122.CF2343A40D7@sparrow.telecommunity.com>

At 09:23 PM 7/21/2007 +0200, Georg Brandl wrote:
>What I, as an outsider, can see: for the Pygments package, Jim's page
>lists the development link from the package description
>(http://trac.pocoo.org/repos/pygments/trunk#egg=Pygments-dev), but
>it looks like it's badly extracted (it has a trailing ">`__"), yours
>doesn't list it at all.

Hm, perhaps Jim is extracting it by looking for #egg URLs, rather 
than by actually processing the reST markup with docutils.  That 
should probably be fixed, since there are many ways to specify URLs 
in reST and handling them all with regular expressions is unlikely to 
work as well as applying regular expressions to the resulting HTML.  :)

(Also, looking only for #egg links will miss non-#egg links embedded 
in the long_description, in the event that someone places direct 
download links there.) 


From martin at v.loewis.de  Sun Jul 22 00:53:03 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 22 Jul 2007 00:53:03 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <20070721194908.F16373A403A@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<20070721174558.DDF923A403A@sparrow.telecommunity.com>
	<46A25D32.4080606@v.loewis.de>
	<20070721194908.F16373A403A@sparrow.telecommunity.com>
Message-ID: <46A28E4F.5070905@v.loewis.de>

> See point #2:
> 
> """2. Individual project version pages' URLs must be of the form
> base/projectname/version, where base is the package index's base URL."""
> 
> That's what's meant by "version pages" in point #7 -- i.e., they *must*
> be of that two-part form for setuptools to recognize them as such.

Ok, but I still cannot see how to fix that: there simply *is* no
version part that I could point to.

Does that mean that Jim's approach does not work?

> Yes; of course, the usual rendering needs to be applied, since
> long_description can contain reStructuredText.

Ok, I now added these links as well.

Regards,
Martin


From pje at telecommunity.com  Sun Jul 22 01:20:04 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 21 Jul 2007 19:20:04 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A28E4F.5070905@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<20070721174558.DDF923A403A@sparrow.telecommunity.com>
	<46A25D32.4080606@v.loewis.de>
	<20070721194908.F16373A403A@sparrow.telecommunity.com>
	<46A28E4F.5070905@v.loewis.de>
Message-ID: <20070721231808.2D5793A403A@sparrow.telecommunity.com>

At 12:53 AM 7/22/2007 +0200, Martin v. L?wis wrote:
> > See point #2:
> >
> > """2. Individual project version pages' URLs must be of the form
> > base/projectname/version, where base is the package index's base URL."""
> >
> > That's what's meant by "version pages" in point #7 -- i.e., they *must*
> > be of that two-part form for setuptools to recognize them as such.
>
>Ok, but I still cannot see how to fix that: there simply *is* no
>version part that I could point to.

Actually, 'version' is allowed to be an empty string, so simply 
adding a trailing '/' to the links you're generating now should work.

The only thing the version part of a version page URL is used for, is 
to handle links to .py files: setuptools uses the package version (if 
available) to synthesize a setup.py for installing standalone .py files.

If the version is not available, it won't be able to do that, but 
that's a relatively minor feature, all things considered.  Few 
packages are distributed via a single .py download URL, but the 
package index could actually tack on an #egg designator to such links 
in order to preserve 100% backward-compatibility.


>Does that mean that Jim's approach does not work?

Jim isn't providing the top-level index, and thus doesn't provide 
punctuation or case corrections.  The "version pages" convention is 
only used by setuptools to discover additional index pages for 
crawling, anyway, and his whole design is intended to prevent crawling.


> > Yes; of course, the usual rendering needs to be applied, since
> > long_description can contain reStructuredText.
>
>Ok, I now added these links as well.

Looks good, thanks!


From martin at v.loewis.de  Sun Jul 22 09:42:19 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 22 Jul 2007 09:42:19 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <20070721231808.2D5793A403A@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<20070721174558.DDF923A403A@sparrow.telecommunity.com>
	<46A25D32.4080606@v.loewis.de>
	<20070721194908.F16373A403A@sparrow.telecommunity.com>
	<46A28E4F.5070905@v.loewis.de>
	<20070721231808.2D5793A403A@sparrow.telecommunity.com>
Message-ID: <46A30A5B.4020007@v.loewis.de>

> Actually, 'version' is allowed to be an empty string, so simply adding a
> trailing '/' to the links you're generating now should work.

It does indeed.

Regards,
Martin

From jim at zope.com  Sun Jul 22 15:09:44 2007
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jul 2007 09:09:44 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A259C4.6090605@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
Message-ID: <FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>


On Jul 21, 2007, at 3:08 PM, Martin v. L?wis wrote:

>>> - it does include a top-level index of all packages (but neither
>>>   releases nor descriptions)
>>
>> Why?  This is a relatively expensive page, due to it's size I assume,
>> that really provides no value.  This will slow down setuptools.
>
> IIUC, it won't slow down setuptools, as setuptools looks at it only
> if it cannot find the real package page due to a misspelling. So
> as long as everything is spelled correctly, it should not provide
> any slowdown.
>
> If people do misspell a package name when invoking easy_install,
> they get the feature that you consider of no value.

That is not correct. Not all packages are in PyPI.  Using a package  
that isn't in PyPI will trigger a fetch of that page.  It isn't  
misspelled, it's just not there.  People should *not* misspell pages  
when using setuptools.  They should certainly not use misspelled  
package names in requirements.  In my strongly help opinion, allowing  
imprecise names in requirements and setuptools command if of negative  
value.

> As for performance - 30 downloads take 3.9s currently from nearby.

That's nice.  For me, that page takes 3 or 4 times as long as other  
pages.

>>> - it's always current, due to being dynamically computed
>>
>> And also unreliable, for the same reason. For example, it would have
>> been inaccessible yesterday afternoon.
>
> The same could happen to Apache, too, of course. svn.python.org
> sometimes fails to restart when a restart is request on log rotation.
>
> Any software is unreliable; to reduce downtime, you need an operator
> that is available when something breaks.

Apache has a far better record than the cheeseshop.  I give up.

>> And also puts more load on the server.  It would be much better imo
>> if static pages could be written on writes.
>
> Contributions are welcome. In addition to me considering it futile,
> I also don't know how to implement it correctly.

I'd be happy to contribute my polling version.  That solves my  
problems and I can't justify the additional effort to figure out the  
cheeseshop softtware.

...
>> They lack the #egg= links.
>
> How are these computed?

By parsing the description.

Apparently, I'm going this incorrectly.  I'll have to look into that.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Sun Jul 22 15:16:44 2007
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jul 2007 09:16:44 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <20070721195122.CF2343A40D7@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de> <f7tmf3$nal$1@sea.gmane.org>
	<20070721195122.CF2343A40D7@sparrow.telecommunity.com>
Message-ID: <CF58706D-4856-4B48-9FCB-549FC9A82797@zope.com>


On Jul 21, 2007, at 3:53 PM, Phillip J. Eby wrote:

> At 09:23 PM 7/21/2007 +0200, Georg Brandl wrote:
>> What I, as an outsider, can see: for the Pygments package, Jim's page
>> lists the development link from the package description
>> (http://trac.pocoo.org/repos/pygments/trunk#egg=Pygments-dev), but
>> it looks like it's badly extracted (it has a trailing ">`__"), yours
>> doesn't list it at all.
>
> Hm, perhaps Jim is extracting it by looking for #egg URLs, rather  
> than by actually processing the reST markup with docutils.

Yup.

> That should probably be fixed, since there are many ways to specify  
> URLs in reST and handling them all with regular expressions is  
> unlikely to work

Yeah, I was hoping to get off easy. :)

> as well as applying regular expressions to the resulting HTML.  :)

:)

> (Also, looking only for #egg links will miss non-#egg links  
> embedded in the long_description, in the event that someone places  
> direct download links there.)

By this, I assume you mean direct links to distributions.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Sun Jul 22 15:19:05 2007
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jul 2007 09:19:05 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <20070721231808.2D5793A403A@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<20070721174558.DDF923A403A@sparrow.telecommunity.com>
	<46A25D32.4080606@v.loewis.de>
	<20070721194908.F16373A403A@sparrow.telecommunity.com>
	<46A28E4F.5070905@v.loewis.de>
	<20070721231808.2D5793A403A@sparrow.telecommunity.com>
Message-ID: <C1301599-987E-42B9-AA02-550BFC3D37CC@zope.com>


On Jul 21, 2007, at 7:20 PM, Phillip J. Eby wrote:
...
> Jim isn't providing the top-level index, and thus doesn't provide  
> punctuation or case corrections.

Yup

> The "version pages" convention is only used by setuptools to  
> discover additional index pages for crawling, anyway, and his whole  
> design is intended to prevent crawling.

That's a secondary benefit. The main goal is to avoid the expense of  
that page for packages that aren't in PyPI, as some packages I use  
aren't.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Sun Jul 22 18:24:41 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 22 Jul 2007 18:24:41 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
Message-ID: <46A384C9.8040404@v.loewis.de>

>> If people do misspell a package name when invoking easy_install,
>> they get the feature that you consider of no value.
> 
> That is not correct. Not all packages are in PyPI.  Using a package that
> isn't in PyPI will trigger a fetch of that page.

I don't understand. What page is fetched if the package is not in PyPI?

> It isn't misspelled,
> it's just not there.  People should *not* misspell pages when using
> setuptools.  They should certainly not use misspelled package names in
> requirements.  In my strongly help opinion, allowing imprecise names in
> requirements and setuptools command if of negative value.

I cannot comment on. I don't use setuptools, and have no intuition what
is good or bad when using it (for example, I consider .egg files and
the notion of eggs inherently bad).

My main motivation to provide that page is that the setuptools
specification says it should be there. As this entire infrastructure
is for the sake of setuptools, I find it pointless to not support
setuptools fully.

> I'd be happy to contribute my polling version.  That solves my problems
> and I can't justify the additional effort to figure out the cheeseshop
> softtware.

I'd like to hear other opinions here. Would people prefer if the index
was always correct (and perhaps somewhat slow), or would they prefer
instead that it is super-efficient (and somewhat out-of-date)?

Regards,
Martin

From martin at v.loewis.de  Sun Jul 22 18:26:14 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 22 Jul 2007 18:26:14 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <C1301599-987E-42B9-AA02-550BFC3D37CC@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<20070721174558.DDF923A403A@sparrow.telecommunity.com>
	<46A25D32.4080606@v.loewis.de>
	<20070721194908.F16373A403A@sparrow.telecommunity.com>
	<46A28E4F.5070905@v.loewis.de>
	<20070721231808.2D5793A403A@sparrow.telecommunity.com>
	<C1301599-987E-42B9-AA02-550BFC3D37CC@zope.com>
Message-ID: <46A38526.2010308@v.loewis.de>

> That's a secondary benefit. The main goal is to avoid the expense of
> that page for packages that aren't in PyPI, as some packages I use aren't.

I see. Shouldn't that be fixed by providing an option to setuptools
that avoids going to the index for missing packages?

Regards,
Martin

From tseaver at palladion.com  Sun Jul 22 18:33:11 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Sun, 22 Jul 2007 12:33:11 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI   index.
In-Reply-To: <46A384C9.8040404@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<46A384C9.8040404@v.loewis.de>
Message-ID: <46A386C7.5080203@palladion.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin v. L?wis wrote:
>>> If people do misspell a package name when invoking easy_install,
>>> they get the feature that you consider of no value.
>> That is not correct. Not all packages are in PyPI.  Using a package that
>> isn't in PyPI will trigger a fetch of that page.
> 
> I don't understand. What page is fetched if the package is not in PyPI?

I think Jim was referring to a package which is *registered* in PyPI,
but whose download location was elsewhere.

<snip>

>> I'd be happy to contribute my polling version.  That solves my problems
>> and I can't justify the additional effort to figure out the cheeseshop
>> softtware.
> 
> I'd like to hear other opinions here. Would people prefer if the index
> was always correct (and perhaps somewhat slow), or would they prefer
> instead that it is super-efficient (and somewhat out-of-date)?

I would prefer the second, particularly as I think the caching solution
lends itself to mirroring, which would also improve availability.

- From my complete ignorance of the underlying architecture:  the polling
solution would stay pretty current if there were an extremely cheap way
to ask for the latest "transaction ID" on the cheeseshop, or if the
query could fetch only registrations newer than the last poll time.  Are
such queries possible over the XML-RPC interface?


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGo4bH+gerLs4ltQ4RAjiWAJ9/5TeOWAHdwL7PS5QAUnpyZWJzMQCeN5hT
5rRjOHzAu4cf+TKktNntWV8=
=p59N
-----END PGP SIGNATURE-----

From tseaver at palladion.com  Sun Jul 22 18:33:11 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Sun, 22 Jul 2007 12:33:11 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI   index.
In-Reply-To: <46A384C9.8040404@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<46A384C9.8040404@v.loewis.de>
Message-ID: <46A386C7.5080203@palladion.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin v. L?wis wrote:
>>> If people do misspell a package name when invoking easy_install,
>>> they get the feature that you consider of no value.
>> That is not correct. Not all packages are in PyPI.  Using a package that
>> isn't in PyPI will trigger a fetch of that page.
> 
> I don't understand. What page is fetched if the package is not in PyPI?

I think Jim was referring to a package which is *registered* in PyPI,
but whose download location was elsewhere.

<snip>

>> I'd be happy to contribute my polling version.  That solves my problems
>> and I can't justify the additional effort to figure out the cheeseshop
>> softtware.
> 
> I'd like to hear other opinions here. Would people prefer if the index
> was always correct (and perhaps somewhat slow), or would they prefer
> instead that it is super-efficient (and somewhat out-of-date)?

I would prefer the second, particularly as I think the caching solution
lends itself to mirroring, which would also improve availability.

- From my complete ignorance of the underlying architecture:  the polling
solution would stay pretty current if there were an extremely cheap way
to ask for the latest "transaction ID" on the cheeseshop, or if the
query could fetch only registrations newer than the last poll time.  Are
such queries possible over the XML-RPC interface?


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGo4bH+gerLs4ltQ4RAjiWAJ9/5TeOWAHdwL7PS5QAUnpyZWJzMQCeN5hT
5rRjOHzAu4cf+TKktNntWV8=
=p59N
-----END PGP SIGNATURE-----


From pje at telecommunity.com  Sun Jul 22 18:40:11 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 22 Jul 2007 12:40:11 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A38526.2010308@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<20070721174558.DDF923A403A@sparrow.telecommunity.com>
	<46A25D32.4080606@v.loewis.de>
	<20070721194908.F16373A403A@sparrow.telecommunity.com>
	<46A28E4F.5070905@v.loewis.de>
	<20070721231808.2D5793A403A@sparrow.telecommunity.com>
	<C1301599-987E-42B9-AA02-550BFC3D37CC@zope.com>
	<46A38526.2010308@v.loewis.de>
Message-ID: <20070722163754.A78EF3A40A9@sparrow.telecommunity.com>

At 06:26 PM 7/22/2007 +0200, Martin v. L?wis wrote:
> > That's a secondary benefit. The main goal is to avoid the expense of
> > that page for packages that aren't in PyPI, as some packages I use aren't.
>
>I see. Shouldn't that be fixed by providing an option to setuptools
>that avoids going to the index for missing packages?

There's already such an option; --find-links or -f lets you specify 
URLs that should be checked before *any* PyPI access occurs.  If all 
dependencies can be met using those URLs without going to PyPI, and 
you haven't explicitly requested -U (--update), easy_install doesn't 
go to PyPI.

You can also specify such links in a setup script using 
setup(dependency_links=[...]), which bakes them into the .egg.  When 
searching for that egg's dependencies, easy_install will pick them up 
and use them.

So, it's actually possible to install a package and all its 
dependencies without using PyPI at all, if the package author(s) bake 
the URLs in.


From jim at zope.com  Sun Jul 22 18:38:09 2007
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jul 2007 12:38:09 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A384C9.8040404@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<46A384C9.8040404@v.loewis.de>
Message-ID: <E31628FD-9208-4E22-88FF-BAAB86956994@zope.com>


On Jul 22, 2007, at 12:24 PM, Martin v. L?wis wrote:

>>> If people do misspell a package name when invoking easy_install,
>>> they get the feature that you consider of no value.
>>
>> That is not correct. Not all packages are in PyPI.  Using a  
>> package that
>> isn't in PyPI will trigger a fetch of that page.
>
> I don't understand. What page is fetched if the package is not in  
> PyPI?

We have lots of packages that aren't in PyPI.  Some of them aren't  
ready for PyPI or are not of general interest. Some are proprietary.


>> It isn't misspelled,
>> it's just not there.  People should *not* misspell pages when using
>> setuptools.  They should certainly not use misspelled package  
>> names in
>> requirements.  In my strongly help opinion, allowing imprecise  
>> names in
>> requirements and setuptools command if of negative value.
>
> I cannot comment on. I don't use setuptools, and have no intuition  
> what
> is good or bad when using it (for example, I consider .egg files and
> the notion of eggs inherently bad).
>
> My main motivation to provide that page is that the setuptools
> specification says it should be there. As this entire infrastructure
> is for the sake of setuptools, I find it pointless to not support
> setuptools fully.

Fair enough. Theory beats practicality every time. ;)


>> I'd be happy to contribute my polling version.  That solves my  
>> problems
>> and I can't justify the additional effort to figure out the  
>> cheeseshop
>> softtware.
>
> I'd like to hear other opinions here.

Yes. This has been a fairly limited discussion. Sigh.

> Would people prefer if the index
> was always correct (and perhaps somewhat slow), or would they prefer
> instead that it is super-efficient (and somewhat out-of-date)?

Where somewhat out of date could be a matter of seconds.  IMO, a  
python.org index could poll every few seconds, given that local  
polling only takes a few milliseconds.  I have a feeling that this  
discussion is going to annoy someone with PyPI software knowledge  
enough to add baking on write.  :) For example, I had the impression  
that Rene' was planning to invoke scripts after updates.  It would be  
easy to invoke my polling script or a script based on your work,

BTW, I'm pretty sure that geographic mirrors are desirable, both for  
performance and redundancy reasons.  I think that, for these, polling  
once a minute is plenty and puts negligible load on PyPI, assuming  
that there aren't hundreds of them.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Sun Jul 22 18:41:55 2007
From: jim at zope.com (Jim Fulton)
Date: Sun, 22 Jul 2007 12:41:55 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI   index.
In-Reply-To: <46A386C7.5080203@palladion.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<46A384C9.8040404@v.loewis.de> <46A386C7.5080203@palladion.com>
Message-ID: <437D4304-ECF3-4240-8C33-F946128F8232@zope.com>


On Jul 22, 2007, at 12:33 PM, Tres Seaver wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Martin v. L?wis wrote:
>>>> If people do misspell a package name when invoking easy_install,
>>>> they get the feature that you consider of no value.
>>> That is not correct. Not all packages are in PyPI.  Using a  
>>> package that
>>> isn't in PyPI will trigger a fetch of that page.
>>
>> I don't understand. What page is fetched if the package is not in  
>> PyPI?
>
> I think Jim was referring to a package which is *registered* in PyPI,
> but whose download location was elsewhere.

No, I was referring to packages that aren't ready for or of interest  
to PyPI or to proprietary packages.

...

> - From my complete ignorance of the underlying architecture:  the  
> polling
> solution would stay pretty current if there were an extremely cheap  
> way
> to ask for the latest "transaction ID" on the cheeseshop, or if the
> query could fetch only registrations newer than the last poll time.

There is such an API thanks to Martin.

>   Are
> such queries possible over the XML-RPC interface?

Yup. I'm using them.  Queries take only a few milliseconds per  
request on the server.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Sun Jul 22 18:51:40 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 22 Jul 2007 12:51:40 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
Message-ID: <20070722164922.AE50D3A40A9@sparrow.telecommunity.com>

At 09:09 AM 7/22/2007 -0400, Jim Fulton wrote:
>People should *not* misspell pages
>when using setuptools.  They should certainly not use misspelled
>package names in requirements.

People do all sorts of things they shouldn't.  That doesn't stop them 
blaming other people for their mistakes.

It's said that a 10% improvement in ease-of-use can double a 
product's users.  Case sensitivity is a barrier to entry for new 
users, and setuptools can't afford any additional entry barriers.

A significant part of setuptools' audience includes people who are 
new to Python, or at least new to installing or distributing Python 
modules, and quite a lot of setuptools features are aimed squarely at 
that audience.  This happens to be one of them.


>   In my strongly help opinion, allowing
>imprecise names in requirements and setuptools command if of negative
>value.

I understand that perspective.  But practicality beats purity, and 
this is absolutely a "worse is better" type of situation.

Setuptools has lots of features that are targeted at different 
audiences.  There are plenty of features targeted at the group you're 
in, don't begrudge the other groups their features.  :)

(This is probably one reason that setuptools is so controversial; 
everybody can find *something* about it to hate, even if those very 
same things are quite loved by a different group of users.  E.g. you 
and case-insensitivity, Martin and eggs, etc.)


From martin at v.loewis.de  Sun Jul 22 18:54:36 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 22 Jul 2007 18:54:36 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI   index.
In-Reply-To: <46A386C7.5080203@palladion.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<46A384C9.8040404@v.loewis.de> <46A386C7.5080203@palladion.com>
Message-ID: <46A38BCC.1000707@v.loewis.de>

> I would prefer the second, particularly as I think the caching solution
> lends itself to mirroring, which would also improve availability.

I think this conclusion is wrong: Jim already has a mirror
infrastructure that anybody can run, without the need of running that
on the central server.

> - From my complete ignorance of the underlying architecture:  the polling
> solution would stay pretty current if there were an extremely cheap way
> to ask for the latest "transaction ID" on the cheeseshop, or if the
> query could fetch only registrations newer than the last poll time.  Are
> such queries possible over the XML-RPC interface?

Yes; you can ask for all changes since a certain UTC time. People
shouldn't invoke that every UTC second, though - once a minute is
fine.

Regards,
Martin

From martin at v.loewis.de  Sun Jul 22 19:03:49 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 22 Jul 2007 19:03:49 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <E31628FD-9208-4E22-88FF-BAAB86956994@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<46A384C9.8040404@v.loewis.de>
	<E31628FD-9208-4E22-88FF-BAAB86956994@zope.com>
Message-ID: <46A38DF5.6010701@v.loewis.de>

Jim Fulton schrieb:
> On Jul 22, 2007, at 12:24 PM, Martin v. L?wis wrote:
>>>> If people do misspell a package name when invoking easy_install,
>>>> they get the feature that you consider of no value.
>>>
>>> That is not correct. Not all packages are in PyPI.  Using a package that
>>> isn't in PyPI will trigger a fetch of that page.
>>
>> I don't understand. What page is fetched if the package is not in PyPI?
> 
> We have lots of packages that aren't in PyPI.  Some of them aren't ready
> for PyPI or are not of general interest. Some are proprietary.

Ah, ok. So I stand to my original statement (the one you classified
as incorrect): *If* I do misspell a package name, *then* setuptools
will correct the spelling if the index page is available.

>> Would people prefer if the index
>> was always correct (and perhaps somewhat slow), or would they prefer
>> instead that it is super-efficient (and somewhat out-of-date)?
> 
> Where somewhat out of date could be a matter of seconds.

And where somewhat slower could be "practically not noticable".

> BTW, I'm pretty sure that geographic mirrors are desirable, both for
> performance and redundancy reasons.  I think that, for these, polling
> once a minute is plenty and puts negligible load on PyPI, assuming that
> there aren't hundreds of them.

Sure: I don't mind at all if more people run your software on their
machines. If people want it more official, we can have
"cheeseshop0.python.org", "cheeseshop1.python.org", and so on,
or "de.cheeseshop.python.org", "jp.cheeseshop.python.org", and so
on.

As I said before: if people also want to mirror the files, I'd
ask them provide download statistics. Given the changelog, it
would be easy to keep a file mirror up-to-date (of course,
if a mirror downloads all files, these downloads also count
towards the download statistics - which might confuse people).

Regards,
Martin

From martin at v.loewis.de  Sun Jul 22 20:40:05 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 22 Jul 2007 20:40:05 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
Message-ID: <46A3A485.7060602@v.loewis.de>

> WRT zc.buildout, refreshing a buildout with just ZODB installed in it  
> takes about 45 seconds for me using PyPI and about 5 seconds using  
> the experimental index.

Can you kindly provide a measurement for the index at
http://cheeseshop.python.org/simple/ as well?

Thanks,
Martin

From fdrake at gmail.com  Mon Jul 23 06:56:48 2007
From: fdrake at gmail.com (Fred Drake)
Date: Mon, 23 Jul 2007 00:56:48 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
Message-ID: <9cee7ab80707222156o2bae8a32pdaf7767f8c167918@mail.gmail.com>

On 7/22/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> Setuptools has lots of features that are targeted at different
> audiences.  There are plenty of features targeted at the group you're
> in, don't begrudge the other groups their features.  :)

Actually, I suspect this is a substantial contributor to setuptools
being considered controversial: it encompasses to many different
features.  That certainly keeps me feeling unhappy about depending on
it.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"Chaos is the score upon which reality is written." --Henry Miller

From jim at zope.com  Mon Jul 23 12:59:44 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 23 Jul 2007 06:59:44 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A38DF5.6010701@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<46A384C9.8040404@v.loewis.de>
	<E31628FD-9208-4E22-88FF-BAAB86956994@zope.com>
	<46A38DF5.6010701@v.loewis.de>
Message-ID: <CC944592-8FA3-4010-A4AB-7F2CADF8153D@zope.com>


On Jul 22, 2007, at 1:03 PM, Martin v. L?wis wrote:

> Jim Fulton schrieb:
>> On Jul 22, 2007, at 12:24 PM, Martin v. L?wis wrote:
>>>>> If people do misspell a package name when invoking easy_install,
>>>>> they get the feature that you consider of no value.
>>>>
>>>> That is not correct. Not all packages are in PyPI.  Using a  
>>>> package that
>>>> isn't in PyPI will trigger a fetch of that page.
>>>
>>> I don't understand. What page is fetched if the package is not in  
>>> PyPI?
>>
>> We have lots of packages that aren't in PyPI.  Some of them aren't  
>> ready
>> for PyPI or are not of general interest. Some are proprietary.
>
> Ah, ok. So I stand to my original statement (the one you classified
> as incorrect): *If* I do misspell a package name, *then* setuptools
> will correct the spelling if the index page is available.

Your full original statement was:

On Jul 21, 2007, at 3:08 PM, Martin v. L?wis wrote:
> IIUC, it won't slow down setuptools, as setuptools looks at it only
> if it cannot find the real package page due to a misspelling. So
> as long as everything is spelled correctly, it should not provide
> any slowdown.
>
> If people do misspell a package name when invoking easy_install,
> they get the feature that you consider of no value.

I was referring to the part about not slowing things down when people  
didn't misspell.  But it looks like I was mistaken. It was my  
understanding that setuptools always checked index/ when it couldn't  
find index/package_name/, but as Phillip pointed out, if it finds a  
package via find links, it won't look at index/.  Basic tests seem to  
confirm this.

>>> Would people prefer if the index
>>> was always correct (and perhaps somewhat slow), or would they prefer
>>> instead that it is super-efficient (and somewhat out-of-date)?
>>
>> Where somewhat out of date could be a matter of seconds.
>
> And where somewhat slower could be "practically not noticable".

I wasn't arguing about speed.  I agree that when PyPI is working  
well, the difference between the speed of the dynamic page and the  
speed of a static page wouldn't be noticeable.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Mon Jul 23 13:08:50 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 23 Jul 2007 07:08:50 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
Message-ID: <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>


On Jul 22, 2007, at 12:51 PM, Phillip J. Eby wrote:

> At 09:09 AM 7/22/2007 -0400, Jim Fulton wrote:
>> People should *not* misspell pages
>> when using setuptools.  They should certainly not use misspelled
>> package names in requirements.
>
> People do all sorts of things they shouldn't.  That doesn't stop  
> them blaming other people for their mistakes.
>
> It's said that a 10% improvement in ease-of-use can double a  
> product's users.  Case sensitivity is a barrier to entry for new  
> users, and setuptools can't afford any additional entry barriers.

I totally don't buy this in a case like this.  People installing  
packages with setuptools are technical users.  We expect them to  
write Python scripts.


> A significant part of setuptools' audience includes people who are  
> new to Python, or at least new to installing or distributing Python  
> modules, and quite a lot of setuptools features are aimed squarely  
> at that audience.  This happens to be one of them.

I don't think that encouraging use of case insensitive names by  
people who are about start learning a language that uses case  
sensitive names is doing them any favors.


>>   In my strongly help opinion, allowing
>> imprecise names in requirements and setuptools command if of negative
>> value.
>
> I understand that perspective.  But practicality beats purity, and  
> this is absolutely a "worse is better" type of situation.

Obviously we disagree.

> Setuptools has lots of features that are targeted at different  
> audiences.  There are plenty of features targeted at the group  
> you're in, don't begrudge the other groups their features.  :)

I don't think you are helping them.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Mon Jul 23 13:36:45 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 23 Jul 2007 07:36:45 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A3A485.7060602@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A3A485.7060602@v.loewis.de>
Message-ID: <617C738B-BDB4-4EDE-900E-64B50EFC2ED6@zope.com>


On Jul 22, 2007, at 2:40 PM, Martin v. L?wis wrote:

>> WRT zc.buildout, refreshing a buildout with just ZODB installed in it
>> takes about 45 seconds for me using PyPI and about 5 seconds using
>> the experimental index.
>
> Can you kindly provide a measurement for the index at
> http://cheeseshop.python.org/simple/ as well?

Yup. So, ATM:

   Using old PyPI takes about 1m5s
   Using simple takes about       25s
   Using ppix  takes about            8s

Some notes:

- ZODB isn't the best example as it has download links to  
www.zope.org, making it take longer than packages without offsite  
links (relative to PyPI).

- I expect that the difference between simple and ppix *for me* is a  
matter of geography.

Refreshing an empty buildout checks the zc.buildout and setuptools  
packages.  For that:

Old PyPI takes 25s
Simple takes      8s
and ppix takes   .5s

Again, I assume that the difference between simple and ppix has more  
to do with geography than the difference between serving statically  
and dynamically. The simple page has more links on it than the ppix  
page, because I haven't gotten around to scarf all links off of a  
restructured-text rendering of long description.  I doubt that makes  
any difference.  It will be interesting to try again after I fix that.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Mon Jul 23 17:22:30 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jul 2007 11:22:30 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
Message-ID: <20070723152015.E7AFA3A403D@sparrow.telecommunity.com>

At 07:08 AM 7/23/2007 -0400, Jim Fulton wrote:
>On Jul 22, 2007, at 12:51 PM, Phillip J. Eby wrote:
>>At 09:09 AM 7/22/2007 -0400, Jim Fulton wrote:
>>>People should *not* misspell pages
>>>when using setuptools.  They should certainly not use misspelled
>>>package names in requirements.
>>
>>People do all sorts of things they shouldn't.  That doesn't stop
>>them blaming other people for their mistakes.
>>
>>It's said that a 10% improvement in ease-of-use can double a
>>product's users.  Case sensitivity is a barrier to entry for new
>>users, and setuptools can't afford any additional entry barriers.
>
>I totally don't buy this in a case like this.  People installing
>packages with setuptools are technical users.  We expect them to
>write Python scripts.

No, "we" don't.  Eggs were created to support application-level 
plugins, such as are used by Trac and Chandler.  Trac and Chandler 
users are not necessarily programmers, let alone Python programmers.


From tseaver at palladion.com  Mon Jul 23 18:01:02 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Mon, 23 Jul 2007 12:01:02 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
Message-ID: <46A4D0BE.4030706@palladion.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Phillip J. Eby wrote:
> At 07:08 AM 7/23/2007 -0400, Jim Fulton wrote:
>> On Jul 22, 2007, at 12:51 PM, Phillip J. Eby wrote:
>>> At 09:09 AM 7/22/2007 -0400, Jim Fulton wrote:
>>>> People should *not* misspell pages
>>>> when using setuptools.  They should certainly not use misspelled
>>>> package names in requirements.
>>> People do all sorts of things they shouldn't.  That doesn't stop
>>> them blaming other people for their mistakes.
>>>
>>> It's said that a 10% improvement in ease-of-use can double a
>>> product's users.  Case sensitivity is a barrier to entry for new
>>> users, and setuptools can't afford any additional entry barriers.
>> I totally don't buy this in a case like this.  People installing
>> packages with setuptools are technical users.  We expect them to
>> write Python scripts.
> 
> No, "we" don't.  Eggs were created to support application-level 
> plugins, such as are used by Trac and Chandler.  Trac and Chandler 
> users are not necessarily programmers, let alone Python programmers.

But by definition, the people typing the names of the dependencies into
a 'setup.py' for such a plugin *are* Python programmers, and could be
expected to know about case sensitivity.

I don't think Jim was areguing that human-centric *search* should punish
misspellings, but rather that encouraging such sloppiness in other
packages is a misfeature, especially if supporting it induces a tax on
*all* users of automated dependency resolution.


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGpNC++gerLs4ltQ4RAr2HAJ9UdPIVdz36inTG7nkm8SnrWPpcOgCgjKPc
sOqbuwOhUvlsSYpgxFSz1mg=
=F1EY
-----END PGP SIGNATURE-----

From tseaver at palladion.com  Mon Jul 23 18:01:02 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Mon, 23 Jul 2007 12:01:02 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
Message-ID: <46A4D0BE.4030706@palladion.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Phillip J. Eby wrote:
> At 07:08 AM 7/23/2007 -0400, Jim Fulton wrote:
>> On Jul 22, 2007, at 12:51 PM, Phillip J. Eby wrote:
>>> At 09:09 AM 7/22/2007 -0400, Jim Fulton wrote:
>>>> People should *not* misspell pages
>>>> when using setuptools.  They should certainly not use misspelled
>>>> package names in requirements.
>>> People do all sorts of things they shouldn't.  That doesn't stop
>>> them blaming other people for their mistakes.
>>>
>>> It's said that a 10% improvement in ease-of-use can double a
>>> product's users.  Case sensitivity is a barrier to entry for new
>>> users, and setuptools can't afford any additional entry barriers.
>> I totally don't buy this in a case like this.  People installing
>> packages with setuptools are technical users.  We expect them to
>> write Python scripts.
> 
> No, "we" don't.  Eggs were created to support application-level 
> plugins, such as are used by Trac and Chandler.  Trac and Chandler 
> users are not necessarily programmers, let alone Python programmers.

But by definition, the people typing the names of the dependencies into
a 'setup.py' for such a plugin *are* Python programmers, and could be
expected to know about case sensitivity.

I don't think Jim was areguing that human-centric *search* should punish
misspellings, but rather that encouraging such sloppiness in other
packages is a misfeature, especially if supporting it induces a tax on
*all* users of automated dependency resolution.


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGpNC++gerLs4ltQ4RAr2HAJ9UdPIVdz36inTG7nkm8SnrWPpcOgCgjKPc
sOqbuwOhUvlsSYpgxFSz1mg=
=F1EY
-----END PGP SIGNATURE-----


From noah.gift at gmail.com  Mon Jul 23 18:37:47 2007
From: noah.gift at gmail.com (Noah Gift)
Date: Mon, 23 Jul 2007 12:37:47 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <46A4D0BE.4030706@palladion.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com>
Message-ID: <e91cc0270707230937p65d3355cw621eed7e72c96bb4@mail.gmail.com>

>
>
> But by definition, the people typing the names of the dependencies into
> a 'setup.py' for such a plugin *are* Python programmers, and could be
> expected to know about case sensitivity.
>
> I don't think Jim was areguing that human-centric *search* should punish
> misspellings, but rather that encouraging such sloppiness in other
> packages is a misfeature, especially if supporting it induces a tax on
> *all* users of automated dependency resolution.
>
>
In my humble opinion, I for one completely agree with Phillip.  I have had
to sit down with quite a few new Python Programmers and show them how to use
easy_install and I "thank God" easy_install is smart enough to figure out
case sensitivity.  This is a wonderful feature!!!!  Please don't ever get
rid of it :)
Not being able to install a package as they couldn't figure out the exact
name of the package could be the final straw for some new programmer to
Python!

Noah Gift
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/catalog-sig/attachments/20070723/9d7ebe75/attachment.htm 

From barry at python.org  Mon Jul 23 18:46:24 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 23 Jul 2007 12:46:24 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <46A4D0BE.4030706@palladion.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com>
Message-ID: <D0A21FA9-71AE-4631-8032-8507D48D064D@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Jul 23, 2007, at 12:01 PM, Tres Seaver wrote:

>>>> It's said that a 10% improvement in ease-of-use can double a
>>>> product's users.

Under that principle, can I renew my plea for a better name than  
"easy_install"?

- -Barry


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRqTbYHEjvBPtnXfVAQIHmgP+L5eDz3n4mrcPk5K6NEexQPLrOT9iSd+w
cFYhn+FL5QoK6snRfxFp25KFmdz/raKDeGpQ4ZIy3nhpZTqxeQpPCsAg84rrw0lQ
lflPXkMMmZJTi+3JmjXc2mhj2SlHZ+73XxRPcD2NKnqr14sxlunJMPe4/IX+y1Rf
9C5WVwoCiJ0=
=b+zs
-----END PGP SIGNATURE-----

From jodok at lovelysystems.com  Mon Jul 23 19:56:45 2007
From: jodok at lovelysystems.com (Jodok Batlogg)
Date: Mon, 23 Jul 2007 13:56:45 -0400
Subject: [Catalog-sig] setuptools upload to pypi
Message-ID: <C9BBF98B-C127-40AD-8888-AF4E60EE9D64@lovelysystems.com>

hi,

i can't upload a new egg to cheeseshop...

running "python setup.py bdist_egg register upload" hangs for several  
minutes at "Using PyPI login from /Users/jodok/.pypirc".
entering username and password interactively results in the same.
the webinterface seems to work fine (at least browsing)

any idea?

thanks

jodok

--
"Now is better than never."
   -- The Zen of Python, by Tim Peters

Jodok Batlogg, Lovely Systems
Schmelzh?tterstra?e 26a, 6850 Dornbirn, Austria
phone: +43 5572 908060, fax: +43 5572 908060-77


-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2454 bytes
Desc: not available
Url : http://mail.python.org/pipermail/catalog-sig/attachments/20070723/2e31b7b4/attachment.bin 

From kantrn at rpi.edu  Mon Jul 23 20:28:27 2007
From: kantrn at rpi.edu (Noah Kantrowitz)
Date: Mon, 23 Jul 2007 14:28:27 -0400
Subject: [Catalog-sig] setuptools upload to pypi
In-Reply-To: <C9BBF98B-C127-40AD-8888-AF4E60EE9D64@lovelysystems.com>
References: <C9BBF98B-C127-40AD-8888-AF4E60EE9D64@lovelysystems.com>
Message-ID: <46A4F34B.4090004@rpi.edu>

I've been seeing that this morning too. Uploads work fine, its just the
register that seems to fail.

--Noah

Jodok Batlogg wrote:
> hi,
>
> i can't upload a new egg to cheeseshop...
>
> running "python setup.py bdist_egg register upload" hangs for several
> minutes at "Using PyPI login from /Users/jodok/.pypirc".
> entering username and password interactively results in the same.
> the webinterface seems to work fine (at least browsing)
>
> any idea?
>
> thanks
>
> jodok
>
> -- 
> "Now is better than never."
>   -- The Zen of Python, by Tim Peters
>
> Jodok Batlogg, Lovely Systems
> Schmelzh?tterstra?e 26a, 6850 Dornbirn, Austria
> phone: +43 5572 908060, fax: +43 5572 908060-77
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>   


From tseaver at palladion.com  Mon Jul 23 20:48:40 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Mon, 23 Jul 2007 14:48:40 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <e91cc0270707230937p65d3355cw621eed7e72c96bb4@mail.gmail.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>	<46A4D0BE.4030706@palladion.com>
	<e91cc0270707230937p65d3355cw621eed7e72c96bb4@mail.gmail.com>
Message-ID: <46A4F808.4050406@palladion.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Noah Gift wrote:
>>
>> But by definition, the people typing the names of the dependencies into
>> a 'setup.py' for such a plugin *are* Python programmers, and could be
>> expected to know about case sensitivity.
>>
>> I don't think Jim was areguing that human-centric *search* should punish
>> misspellings, but rather that encouraging such sloppiness in other
>> packages is a misfeature, especially if supporting it induces a tax on
>> *all* users of automated dependency resolution.
>>
>>
> In my humble opinion, I for one completely agree with Phillip.  I have had
> to sit down with quite a few new Python Programmers and show them how to use
> easy_install and I "thank God" easy_install is smart enough to figure out
> case sensitivity.  This is a wonderful feature!!!!  Please don't ever get
> rid of it :)
> Not being able to install a package as they couldn't figure out the exact
> name of the package could be the final straw for some new programmer to
> Python!

There are two different use cases here:

 1. User mis-types the name of a package on the command line, e.g.:

     $ easy_install Foo

    when it should be spelled:

     $ easy_install foo

    Being forgiving of case-mangling here ia a concern of the
    easy_install *application*, and is non-controversil.

 2. Programmer mis-types the name of a package in the dependencies
    for his own pacakge, e.g.:

      setup(install_requires=['Foo']...)

    In this case, coddling the error causes it to *propagate*, becuase
    other programmers will copy it directly, or depend on the error-
    filled package.  Worse, the cost of error correction is transferred
    to *all* users of the setuptools library, even if they never use
    'easy_install' at all.

I'm fine with leaving the newbie-friendly behavior in 'easy_install';  I
just don't like the performance hit it induces on users of setuptools
who *can* spell.


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGpPgI+gerLs4ltQ4RApzMAJ0WP6gzaM8n99fxkyo0Se285Te3bQCg1vxF
6ihYIENH8GpsQ7/ZF062T4Q=
=OuxU
-----END PGP SIGNATURE-----

From tseaver at palladion.com  Mon Jul 23 20:48:40 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Mon, 23 Jul 2007 14:48:40 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <e91cc0270707230937p65d3355cw621eed7e72c96bb4@mail.gmail.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>	<46A4D0BE.4030706@palladion.com>
	<e91cc0270707230937p65d3355cw621eed7e72c96bb4@mail.gmail.com>
Message-ID: <46A4F808.4050406@palladion.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Noah Gift wrote:
>>
>> But by definition, the people typing the names of the dependencies into
>> a 'setup.py' for such a plugin *are* Python programmers, and could be
>> expected to know about case sensitivity.
>>
>> I don't think Jim was areguing that human-centric *search* should punish
>> misspellings, but rather that encouraging such sloppiness in other
>> packages is a misfeature, especially if supporting it induces a tax on
>> *all* users of automated dependency resolution.
>>
>>
> In my humble opinion, I for one completely agree with Phillip.  I have had
> to sit down with quite a few new Python Programmers and show them how to use
> easy_install and I "thank God" easy_install is smart enough to figure out
> case sensitivity.  This is a wonderful feature!!!!  Please don't ever get
> rid of it :)
> Not being able to install a package as they couldn't figure out the exact
> name of the package could be the final straw for some new programmer to
> Python!

There are two different use cases here:

 1. User mis-types the name of a package on the command line, e.g.:

     $ easy_install Foo

    when it should be spelled:

     $ easy_install foo

    Being forgiving of case-mangling here ia a concern of the
    easy_install *application*, and is non-controversil.

 2. Programmer mis-types the name of a package in the dependencies
    for his own pacakge, e.g.:

      setup(install_requires=['Foo']...)

    In this case, coddling the error causes it to *propagate*, becuase
    other programmers will copy it directly, or depend on the error-
    filled package.  Worse, the cost of error correction is transferred
    to *all* users of the setuptools library, even if they never use
    'easy_install' at all.

I'm fine with leaving the newbie-friendly behavior in 'easy_install';  I
just don't like the performance hit it induces on users of setuptools
who *can* spell.


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGpPgI+gerLs4ltQ4RApzMAJ0WP6gzaM8n99fxkyo0Se285Te3bQCg1vxF
6ihYIENH8GpsQ7/ZF062T4Q=
=OuxU
-----END PGP SIGNATURE-----


From benji at benjiyork.com  Mon Jul 23 20:54:27 2007
From: benji at benjiyork.com (Benji York)
Date: Mon, 23 Jul 2007 14:54:27 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A384C9.8040404@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<46A384C9.8040404@v.loewis.de>
Message-ID: <46A4F963.3040609@benjiyork.com>

Martin v. L?wis wrote:
> would they prefer instead that it is super-efficient (and somewhat
> out-of-date)?

Yes.  At most a few minutes out of date and faster/more reliable would
be my strong preference.
-- 
Benji York
http://benjiyork.com

From jim at zope.com  Mon Jul 23 20:55:16 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 23 Jul 2007 14:55:16 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <46A4F808.4050406@palladion.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>	<46A4D0BE.4030706@palladion.com>
	<e91cc0270707230937p65d3355cw621eed7e72c96bb4@mail.gmail.com>
	<46A4F808.4050406@palladion.com>
Message-ID: <9FFADEB3-0E83-417E-B6EE-AF9A172690D0@zope.com>


On Jul 23, 2007, at 2:48 PM, Tres Seaver wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Noah Gift wrote:
>>>
>>> But by definition, the people typing the names of the  
>>> dependencies into
>>> a 'setup.py' for such a plugin *are* Python programmers, and  
>>> could be
>>> expected to know about case sensitivity.
>>>
>>> I don't think Jim was areguing that human-centric *search* should  
>>> punish
>>> misspellings, but rather that encouraging such sloppiness in other
>>> packages is a misfeature, especially if supporting it induces a  
>>> tax on
>>> *all* users of automated dependency resolution.
>>>
>>>
>> In my humble opinion, I for one completely agree with Phillip.  I  
>> have had
>> to sit down with quite a few new Python Programmers and show them  
>> how to use
>> easy_install and I "thank God" easy_install is smart enough to  
>> figure out
>> case sensitivity.  This is a wonderful feature!!!!  Please don't  
>> ever get
>> rid of it :)
>> Not being able to install a package as they couldn't figure out  
>> the exact
>> name of the package could be the final straw for some new  
>> programmer to
>> Python!
>
> There are two different use cases here:
>
>  1. User mis-types the name of a package on the command line, e.g.:
>
>      $ easy_install Foo
>
>     when it should be spelled:
>
>      $ easy_install foo
>
>     Being forgiving of case-mangling here ia a concern of the
>     easy_install *application*, and is non-controversil.

For me this is potentially controversial because:

>  2. Programmer mis-types the name of a package in the dependencies
>     for his own pacakge, e.g.:
>
>       setup(install_requires=['Foo']...)

Note that this might be intentional, as opposed to a typo.  The  
programmer will think "Foo" is a valid name because it worked with  
easy_install.  It's true that easy_install prints a warning, but it  
is buried in so much output that it is easily missed or ignored.

>     In this case, coddling the error causes it to *propagate*, becuase
>     other programmers will copy it directly, or depend on the error-
>     filled package.  Worse, the cost of error correction is  
> transferred
>     to *all* users of the setuptools library, even if they never use
>     'easy_install' at all.

Well said.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From benji at benjiyork.com  Mon Jul 23 20:58:44 2007
From: benji at benjiyork.com (Benji York)
Date: Mon, 23 Jul 2007 14:58:44 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A38DF5.6010701@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<46A384C9.8040404@v.loewis.de>	<E31628FD-9208-4E22-88FF-BAAB86956994@zope.com>
	<46A38DF5.6010701@v.loewis.de>
Message-ID: <46A4FA64.5050404@benjiyork.com>

Martin v. L?wis wrote:
> And where somewhat slower could be "practically not noticable".

Perhaps it /could/ be, but isn't currently.  For example, updating one 
piece of software I have with almost 150 dependencies takes 45 seconds 
with ppix, 4:45 without.  I plan to do similar timings with the "simple" 
PyPI interface when I get a chance and report the results here.
-- 
Benji York
http://benjiyork.com

From jim at zope.com  Mon Jul 23 21:06:46 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 23 Jul 2007 15:06:46 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A4FA64.5050404@benjiyork.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<46A384C9.8040404@v.loewis.de>	<E31628FD-9208-4E22-88FF-BAAB86956994@zope.com>
	<46A38DF5.6010701@v.loewis.de> <46A4FA64.5050404@benjiyork.com>
Message-ID: <4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com>


On Jul 23, 2007, at 2:58 PM, Benji York wrote:

> Martin v. L?wis wrote:
>> And where somewhat slower could be "practically not noticable".
>
> Perhaps it /could/ be, but isn't currently.  For example, updating  
> one piece of software I have with almost 150 dependencies takes 45  
> seconds with ppix, 4:45 without.  I plan to do similar timings with  
> the "simple" PyPI interface when I get a chance and report the  
> results here.

I suspect that this has more to do with network distance than with  
server speed.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From noah.gift at gmail.com  Mon Jul 23 21:30:44 2007
From: noah.gift at gmail.com (Noah Gift)
Date: Mon, 23 Jul 2007 15:30:44 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<46A384C9.8040404@v.loewis.de>
	<E31628FD-9208-4E22-88FF-BAAB86956994@zope.com>
	<46A38DF5.6010701@v.loewis.de> <46A4FA64.5050404@benjiyork.com>
	<4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com>
Message-ID: <e91cc0270707231230w64322529h6ecb2858007cd449@mail.gmail.com>

On 7/23/07, Jim Fulton <jim at zope.com> wrote:
>
>
> On Jul 23, 2007, at 2:58 PM, Benji York wrote:
>
> > Martin v. L?wis wrote:
> >> And where somewhat slower could be "practically not noticable".
> >
> > Perhaps it /could/ be, but isn't currently.  For example, updating
> > one piece of software I have with almost 150 dependencies takes 45
> > seconds with ppix, 4:45 without.  I plan to do similar timings with
> > the "simple" PyPI interface when I get a chance and report the
> > results here.
>
> I suspect that this has more to do with network distance than with
> server speed.


That is an interesting point.  It is amazing how many directory type things
get slammed, but the problem is really latency...such as a slow DNS lookup.
 I wonder how much quicker an easy_install would be will local DNS
lookups,package names, etc.

I had a problem with a LDAP server I setup that was really tricky to figure
out until I wrote some scripts that ran continuously getting stats, and I
realized that a DNS server would hang occasionally and it would grind
everything to a halt.  People kept telling me they would have an occasional
'ls -l' that would hang for 20 seconds.  Caching DNS servers fixed it.


Jim
>
> --
> Jim Fulton                      mailto:jim at zope.com             Python
> Powered!
> CTO                             (540) 361-1714
> http://www.python.org
> Zope Corporation        http://www.zope.com
> http://www.zope.org
>
>
>
> _______________________________________________
> Distutils-SIG maillist  -  Distutils-SIG at python.org
> http://mail.python.org/mailman/listinfo/distutils-sig
>



-- 
http://www.blog.noahgift.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/catalog-sig/attachments/20070723/af718f06/attachment.html 

From benji at benjiyork.com  Mon Jul 23 21:41:05 2007
From: benji at benjiyork.com (Benji York)
Date: Mon, 23 Jul 2007 15:41:05 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<46A384C9.8040404@v.loewis.de>	<E31628FD-9208-4E22-88FF-BAAB86956994@zope.com>
	<46A38DF5.6010701@v.loewis.de> <46A4FA64.5050404@benjiyork.com>
	<4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com>
Message-ID: <46A50451.5050908@benjiyork.com>

Jim Fulton wrote:
> On Jul 23, 2007, at 2:58 PM, Benji York wrote:
> 
>> Martin v. L?wis wrote:
>>> And where somewhat slower could be "practically not noticable".
>> Perhaps it /could/ be, but isn't currently.  For example, updating  
>> one piece of software I have with almost 150 dependencies takes 45  
>> seconds with ppix, 4:45 without.  I plan to do similar timings with  
>> the "simple" PyPI interface when I get a chance and report the  
>> results here.
> 
> I suspect that this has more to do with network distance than with  
> server speed.

That's actually my point.  Geographically distributed mirrors that are a 
little out of sync are much more valuable (IMO) than a centralized 
service that is absolutely up to date, but "far" away.  For me the 
static/dynamic argument is more about stability, and central/distributed 
is more about (network) speed.
-- 
Benji York
http://benjiyork.com

From benji at benjiyork.com  Mon Jul 23 21:02:08 2007
From: benji at benjiyork.com (Benji York)
Date: Mon, 23 Jul 2007 15:02:08 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
Message-ID: <46A4FB30.2000304@benjiyork.com>

Jim Fulton wrote:
> On Jul 22, 2007, at 12:51 PM, Phillip J. Eby wrote:
>> A significant part of setuptools' audience includes people who are  
>> new to Python, or at least new to installing or distributing Python  
>> modules, and quite a lot of setuptools features are aimed squarely  
>> at that audience.  This happens to be one of them.
> 
> I don't think that encouraging use of case insensitive names by  
> people who are about start learning a language that uses case  
> sensitive names is doing them any favors.

Agreed.
-- 
Benji York
http://benjiyork.com

From benji at benjiyork.com  Mon Jul 23 21:05:42 2007
From: benji at benjiyork.com (Benji York)
Date: Mon, 23 Jul 2007 15:05:42 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <e91cc0270707230937p65d3355cw621eed7e72c96bb4@mail.gmail.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>	<46A4D0BE.4030706@palladion.com>
	<e91cc0270707230937p65d3355cw621eed7e72c96bb4@mail.gmail.com>
Message-ID: <46A4FC06.3010109@benjiyork.com>

Noah Gift wrote:
> In my humble opinion, I for one completely agree with Phillip.  I have had to 
> sit down with quite a few new Python Programmers and show them how to use 
> easy_install and I "thank God" easy_install is smart enough to figure out case 
> sensitivity.  This is a wonderful feature!!!!  Please don't ever get rid of it :)

If easy_install had instead said "sorry, I can't find 'foo', perhaps you 
meant 'Foo'", then the user would be both spared frustration and 
enlightened.
-- 
Benji York
http://benjiyork.com

From martin at v.loewis.de  Mon Jul 23 22:00:12 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 23 Jul 2007 22:00:12 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <617C738B-BDB4-4EDE-900E-64B50EFC2ED6@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A3A485.7060602@v.loewis.de>
	<617C738B-BDB4-4EDE-900E-64B50EFC2ED6@zope.com>
Message-ID: <46A508CC.5010706@v.loewis.de>

> Yup. So, ATM:
> 
>   Using old PyPI takes about 1m5s
>   Using simple takes about       25s
>   Using ppix  takes about            8s

Thanks!

> Again, I assume that the difference between simple and ppix has more to
> do with geography than the difference between serving statically and
> dynamically. The simple page has more links on it than the ppix page,
> because I haven't gotten around to scarf all links off of a
> restructured-text rendering of long description.  I doubt that makes any
> difference.  It will be interesting to try again after I fix that.

If you think that the /simple pages are correct, it might be easier to
just mirror them instead of doing all the work yourself.

I don't plan to take that service offline, unless experimentation
shows it has serious flaws.

Regards,
Martin

From martin at v.loewis.de  Mon Jul 23 22:04:36 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 23 Jul 2007 22:04:36 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <46A4D0BE.4030706@palladion.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com>
Message-ID: <46A509D4.3070108@v.loewis.de>

> But by definition, the people typing the names of the dependencies into
> a 'setup.py' for such a plugin *are* Python programmers, and could be
> expected to know about case sensitivity.
> 
> I don't think Jim was areguing that human-centric *search* should punish
> misspellings, but rather that encouraging such sloppiness in other
> packages is a misfeature, especially if supporting it induces a tax on
> *all* users of automated dependency resolution.

Right. I think Phillip is primarily talking about package names as
specified on the command line of easy_install.

So if your concern is about package names specified in dependencies,
one solution could be that setuptools distinguishes whether to apply
case corrections and normalization, depending on whether it was an
end-user-typed name or a programmer-specified one.

What I don't know is how difficult that would be to implement, and
what volunteer is supposed to implement it if it were easy/possible,
so I by no means propose that such a solution should be implemented,
even if it would solve the problem.

Regards,
Martin

From rlratzel at enthought.com  Mon Jul 23 22:04:18 2007
From: rlratzel at enthought.com (Rick Ratzel)
Date: Mon, 23 Jul 2007 15:04:18 -0500 (CDT)
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <46A4FC06.3010109@benjiyork.com> (message from Benji York on Mon, 
	23 Jul 2007 15:05:42 -0400)
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>	<46A4D0BE.4030706@palladion.com>
	<e91cc0270707230937p65d3355cw621eed7e72c96bb4@mail.gmail.com>
	<46A4FC06.3010109@benjiyork.com>
Message-ID: <20070723200419.058451DF4F6@mail.enthought.com>


>    Date: Mon, 23 Jul 2007 15:05:42 -0400
>    From: Benji York <benji at benjiyork.com>
> 
>    Noah Gift wrote:
>    > In my humble opinion, I for one completely agree with Phillip.  I have had to 
>    > sit down with quite a few new Python Programmers and show them how to use 
>    > easy_install and I "thank God" easy_install is smart enough to figure out case 
>    > sensitivity.  This is a wonderful feature!!!!  Please don't ever get rid of it :)
> 
>    If easy_install had instead said "sorry, I can't find 'foo', perhaps you 
>    meant 'Foo'", then the user would be both spared frustration and 
>    enlightened.

  +1

-- 
Rick Ratzel - Enthought, Inc.
515 Congress Avenue, Suite 2100 - Austin, Texas 78701
512-536-1057 x229 - Fax: 512-536-1059
http://www.enthought.com


From jim at zope.com  Mon Jul 23 22:05:53 2007
From: jim at zope.com (Jim Fulton)
Date: Mon, 23 Jul 2007 16:05:53 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A508CC.5010706@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A3A485.7060602@v.loewis.de>
	<617C738B-BDB4-4EDE-900E-64B50EFC2ED6@zope.com>
	<46A508CC.5010706@v.loewis.de>
Message-ID: <C9AD6F9B-1F94-4479-92E2-14B5F289BF92@zope.com>


On Jul 23, 2007, at 4:00 PM, Martin v. L?wis wrote:
...
>> Again, I assume that the difference between simple and ppix has  
>> more to
>> do with geography than the difference between serving statically and
>> dynamically. The simple page has more links on it than the ppix page,
>> because I haven't gotten around to scarf all links off of a
>> restructured-text rendering of long description.  I doubt that  
>> makes any
>> difference.  It will be interesting to try again after I fix that.
>
> If you think that the /simple pages are correct, it might be easier to
> just mirror them instead of doing all the work yourself.

Good point. I might just do that.

> I don't plan to take that service offline, unless experimentation
> shows it has serious flaws.

Cool.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Mon Jul 23 22:13:37 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 23 Jul 2007 22:13:37 +0200
Subject: [Catalog-sig] setuptools upload to pypi
In-Reply-To: <C9BBF98B-C127-40AD-8888-AF4E60EE9D64@lovelysystems.com>
References: <C9BBF98B-C127-40AD-8888-AF4E60EE9D64@lovelysystems.com>
Message-ID: <46A50BF1.9020303@v.loewis.de>

> i can't upload a new egg to cheeseshop...
> 
> running "python setup.py bdist_egg register upload" hangs for several
> minutes at "Using PyPI login from /Users/jodok/.pypirc".
> entering username and password interactively results in the same.
> the webinterface seems to work fine (at least browsing)
> 
> any idea?

I think that's because I turned of proxying from www.python.org/pypi
to cheeseshop.python.org/pypi, and replaced it with redirection
(302, temporary redirect) instead (temporary just in case people
find problems with that).

(I asked a few days ago whether that would be a problem, and nobody
said it would).

I'd appreciate if somebody could investigate what precisely
is causing the problem (I thought urllib[2] would be able to
handle redirects), how to fix it, and propose a fix to the
code base.

I have now reverted the change (which, of course, gives a
performance problem, as all accesses to www.python.org/pypi
now go through two web servers).

Regards,
Martin

From martin at v.loewis.de  Mon Jul 23 22:16:27 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 23 Jul 2007 22:16:27 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A4FA64.5050404@benjiyork.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<46A384C9.8040404@v.loewis.de>	<E31628FD-9208-4E22-88FF-BAAB86956994@zope.com>
	<46A38DF5.6010701@v.loewis.de> <46A4FA64.5050404@benjiyork.com>
Message-ID: <46A50C9B.7060501@v.loewis.de>

> Perhaps it /could/ be, but isn't currently.  For example, updating one
> piece of software I have with almost 150 dependencies takes 45 seconds
> with ppix, 4:45 without.  I plan to do similar timings with the "simple"
> PyPI interface when I get a chance and report the results here.


I was, of course, talking about the simple interface.

The full index will certainly take much more time because setuptools
has to request more pages, and each page contains a lot of unnecessary
data.

Regards,
Martin

From martin at v.loewis.de  Mon Jul 23 22:21:05 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 23 Jul 2007 22:21:05 +0200
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A50451.5050908@benjiyork.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<46A384C9.8040404@v.loewis.de>	<E31628FD-9208-4E22-88FF-BAAB86956994@zope.com>
	<46A38DF5.6010701@v.loewis.de> <46A4FA64.5050404@benjiyork.com>
	<4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com>
	<46A50451.5050908@benjiyork.com>
Message-ID: <46A50DB1.3080207@v.loewis.de>

>>>> And where somewhat slower could be "practically not noticable".
>>> Perhaps it /could/ be, but isn't currently.  For example, updating 
>>> one piece of software I have with almost 150 dependencies takes 45 
>>> seconds with ppix, 4:45 without.  I plan to do similar timings with 
>>> the "simple" PyPI interface when I get a chance and report the 
>>> results here.
>>
>> I suspect that this has more to do with network distance than with 
>> server speed.
> 
> That's actually my point.  Geographically distributed mirrors that are a
> little out of sync are much more valuable (IMO) than a centralized
> service that is absolutely up to date, but "far" away.

Ok, but then your response didn't really answer my question.

If people want to run distributed mirrors that are somewhat behind,
by all means: start today (just remember to talk to me if you also
want to mirror files - if not, just run Jim's software as-is).

My question was about the "simple" interface on the central
server, to which you seem to say "I don't need it at all - whether
it's current and slow or behind and fast" (which, in a sense,
is also a response to the question, namely "I don't care").

Regards,
Martin

From pje at telecommunity.com  Mon Jul 23 22:43:40 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jul 2007 16:43:40 -0400
Subject: [Catalog-sig] setuptools upload to pypi
In-Reply-To: <46A50BF1.9020303@v.loewis.de>
References: <C9BBF98B-C127-40AD-8888-AF4E60EE9D64@lovelysystems.com>
	<46A50BF1.9020303@v.loewis.de>
Message-ID: <20070723204445.65ABC3A40AA@sparrow.telecommunity.com>

At 10:13 PM 7/23/2007 +0200, Martin v. L?wis wrote:
> > i can't upload a new egg to cheeseshop...
> >
> > running "python setup.py bdist_egg register upload" hangs for several
> > minutes at "Using PyPI login from /Users/jodok/.pypirc".
> > entering username and password interactively results in the same.
> > the webinterface seems to work fine (at least browsing)
> >
> > any idea?
>
>I think that's because I turned of proxying from www.python.org/pypi
>to cheeseshop.python.org/pypi, and replaced it with redirection
>(302, temporary redirect) instead (temporary just in case people
>find problems with that).

If you were doing that for POST requests, that is probably the source 
of the problem.  You could always restrict the proxying to occur only 
for non-GET requests, since IIRC distutils.command.register and 
distutils.command.upload use POSTs.  GET requests generally have a 
much wider leeway for safe redirection than POST requests do.

Of course, one must also preserve the query string in a redirected 
GET, and I don't think Apache's Redirect directive does that 
either.  You can certainly do it with mod_rewrite, however.

I expect that the combination of preserving query strings on 
redirection, and only redirecting GETs should make the transition safe.


From pje at telecommunity.com  Mon Jul 23 22:47:04 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jul 2007 16:47:04 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <46A509D4.3070108@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de>
Message-ID: <20070723204446.294223A40B2@sparrow.telecommunity.com>

At 10:04 PM 7/23/2007 +0200, Martin v. L?wis wrote:
> > But by definition, the people typing the names of the dependencies into
> > a 'setup.py' for such a plugin *are* Python programmers, and could be
> > expected to know about case sensitivity.
> >
> > I don't think Jim was areguing that human-centric *search* should punish
> > misspellings, but rather that encouraging such sloppiness in other
> > packages is a misfeature, especially if supporting it induces a tax on
> > *all* users of automated dependency resolution.
>
>Right. I think Phillip is primarily talking about package names as
>specified on the command line of easy_install.
>
>So if your concern is about package names specified in dependencies,
>one solution could be that setuptools distinguishes whether to apply
>case corrections and normalization, depending on whether it was an
>end-user-typed name or a programmer-specified one.
>
>What I don't know is how difficult that would be to implement, and
>what volunteer is supposed to implement it if it were easy/possible,
>so I by no means propose that such a solution should be implemented,
>even if it would solve the problem.

Yes, especially since compatibility with the existing installation 
base requires case insensitivity, because on case-insensitive 
platforms easy_install already normalizes the case of filenames it 
creates.  So, the question of what the "right thing" to do is in the 
abstract has already been moot for a year or two.


From martin at v.loewis.de  Mon Jul 23 23:03:20 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 23 Jul 2007 23:03:20 +0200
Subject: [Catalog-sig] setuptools upload to pypi
In-Reply-To: <20070723204445.65ABC3A40AA@sparrow.telecommunity.com>
References: <C9BBF98B-C127-40AD-8888-AF4E60EE9D64@lovelysystems.com>
	<46A50BF1.9020303@v.loewis.de>
	<20070723204445.65ABC3A40AA@sparrow.telecommunity.com>
Message-ID: <46A51798.8000907@v.loewis.de>

> because I turned of proxying from www.python.org/pypi
>> to cheeseshop.python.org/pypi, and replaced it with redirection
>> (302, temporary redirect) instead (temporary just in case people
>> find problems with that).
> 
> If you were doing that for POST requests, that is probably the source of
> the problem.  You could always restrict the proxying to occur only for
> non-GET requests, since IIRC distutils.command.register and
> distutils.command.upload use POSTs.  GET requests generally have a much
> wider leeway for safe redirection than POST requests do.

What is the problem with redirects for POST? In particular, why doesn't
urllib2 support it?

> Of course, one must also preserve the query string in a redirected GET,
> and I don't think Apache's Redirect directive does that either.  You can
> certainly do it with mod_rewrite, however.

I see - I was using a plain Redirect.

> I expect that the combination of preserving query strings on
> redirection, and only redirecting GETs should make the transition safe.

Can you share the magic to do that? I'd really like to start phasing
out www.python.org/pypi, although I now see that it will take a few
Python releases to get the cheeseshop home page replaced in distutils.

In particular, if I also keep the mod_proxy setup for the reverse
proxy, how will it interact with the redirect for the GET only?

Regards,
Martin

From fdrake at gmail.com  Mon Jul 23 23:13:18 2007
From: fdrake at gmail.com (Fred Drake)
Date: Mon, 23 Jul 2007 17:13:18 -0400
Subject: [Catalog-sig] setuptools upload to pypi
In-Reply-To: <46A50BF1.9020303@v.loewis.de>
References: <C9BBF98B-C127-40AD-8888-AF4E60EE9D64@lovelysystems.com>
	<46A50BF1.9020303@v.loewis.de>
Message-ID: <9cee7ab80707231413q573c62bas8a03163e03ba9fc1@mail.gmail.com>

On 7/23/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> I think that's because I turned of proxying from www.python.org/pypi
> to cheeseshop.python.org/pypi, and replaced it with redirection
> (302, temporary redirect) instead (temporary just in case people
> find problems with that).
>
> (I asked a few days ago whether that would be a problem, and nobody
> said it would).

I guess I just didn't find the time, but my objections are
non-technical and have apparently been of no interest when voiced in
the past.

Basically, I think exposing human beings to the name "cheeseshop" in
bad.  Specifically, it's confusing to anyone not familiar with a
particular Monty Python skit.  A nice skit (IMO), but not a good
public-facing name for PyPI.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"Chaos is the score upon which reality is written." --Henry Miller

From martin at v.loewis.de  Mon Jul 23 23:13:48 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 23 Jul 2007 23:13:48 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <20070723204446.294223A40B2@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>	<46A4D0BE.4030706@palladion.com>
	<46A509D4.3070108@v.loewis.de>
	<20070723204446.294223A40B2@sparrow.telecommunity.com>
Message-ID: <46A51A0C.2090800@v.loewis.de>

> Yes, especially since compatibility with the existing installation 
> base requires case insensitivity, because on case-insensitive 
> platforms easy_install already normalizes the case of filenames it 
> creates.  So, the question of what the "right thing" to do is in the 
> abstract has already been moot for a year or two.

Can you elaborate a bit, please? Why does the case of filenames
matter for the queries it makes?

AFAIU, it gets package names either from the user or from setup.py,
perhaps also from packages dependency inside .egg files (assuming
those support dependencies); these should all be case-sensitive.

Regards,
Martin

From pje at telecommunity.com  Mon Jul 23 23:21:16 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jul 2007 17:21:16 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <46A51A0C.2090800@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de>
	<20070723204446.294223A40B2@sparrow.telecommunity.com>
	<46A51A0C.2090800@v.loewis.de>
Message-ID: <20070723211919.361E23A403D@sparrow.telecommunity.com>

At 11:13 PM 7/23/2007 +0200, Martin v. L?wis wrote:
> > Yes, especially since compatibility with the existing installation
> > base requires case insensitivity, because on case-insensitive
> > platforms easy_install already normalizes the case of filenames it
> > creates.  So, the question of what the "right thing" to do is in the
> > abstract has already been moot for a year or two.
>
>Can you elaborate a bit, please? Why does the case of filenames
>matter for the queries it makes?
>
>AFAIU, it gets package names either from the user or from setup.py,
>perhaps also from packages dependency inside .egg files (assuming
>those support dependencies); these should all be case-sensitive.

In order to resolve dependencies, the system looks at installed .egg 
files and directories (and .egg-info direcories), and extracts 
package name and version info from the filenames. 


From benji at benjiyork.com  Mon Jul 23 23:26:39 2007
From: benji at benjiyork.com (Benji York)
Date: Mon, 23 Jul 2007 17:26:39 -0400
Subject: [Catalog-sig] setuptools upload to pypi
In-Reply-To: <9cee7ab80707231413q573c62bas8a03163e03ba9fc1@mail.gmail.com>
References: <C9BBF98B-C127-40AD-8888-AF4E60EE9D64@lovelysystems.com>	<46A50BF1.9020303@v.loewis.de>
	<9cee7ab80707231413q573c62bas8a03163e03ba9fc1@mail.gmail.com>
Message-ID: <46A51D0F.30406@benjiyork.com>

Fred Drake wrote:
> Basically, I think exposing human beings to the name "cheeseshop" [is] 
> bad. [...] A nice skit (IMO), but not a good public-facing name for
> PyPI.

I have to agree on both counts.
-- 
Benji York
http://benjiyork.com

From pje at telecommunity.com  Mon Jul 23 23:31:39 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jul 2007 17:31:39 -0400
Subject: [Catalog-sig] setuptools upload to pypi
In-Reply-To: <46A51798.8000907@v.loewis.de>
References: <C9BBF98B-C127-40AD-8888-AF4E60EE9D64@lovelysystems.com>
	<46A50BF1.9020303@v.loewis.de>
	<20070723204445.65ABC3A40AA@sparrow.telecommunity.com>
	<46A51798.8000907@v.loewis.de>
Message-ID: <20070723212920.8BFE63A40AA@sparrow.telecommunity.com>

At 11:03 PM 7/23/2007 +0200, Martin v. L?wis wrote:
> > because I turned of proxying from www.python.org/pypi
> >> to cheeseshop.python.org/pypi, and replaced it with redirection
> >> (302, temporary redirect) instead (temporary just in case people
> >> find problems with that).
> >
> > If you were doing that for POST requests, that is probably the source of
> > the problem.  You could always restrict the proxying to occur only for
> > non-GET requests, since IIRC distutils.command.register and
> > distutils.command.upload use POSTs.  GET requests generally have a much
> > wider leeway for safe redirection than POST requests do.
>
>What is the problem with redirects for POST? In particular, why doesn't
>urllib2 support it?

It's my understanding that a redirection response to a POST means 
"GET the location I'm giving you", not "sorry, you should POST to 
this other place instead."  At least, that's how I understand web 
browsers to interpret it, and I believe urllib2 does as well.

So, the issue is not one of "not supporting" POSTs, it's a question 
of what the semantics of a redirected POST should be.  As far as I'm 
aware, it doesn't cause the POST to repeat, although that *might* 
depend on the specific status code and HTTP version.


> > Of course, one must also preserve the query string in a redirected GET,
> > and I don't think Apache's Redirect directive does that either.  You can
> > certainly do it with mod_rewrite, however.
>
>I see - I was using a plain Redirect.
>
> > I expect that the combination of preserving query strings on
> > redirection, and only redirecting GETs should make the transition safe.
>
>Can you share the magic to do that? I'd really like to start phasing
>out www.python.org/pypi, although I now see that it will take a few
>Python releases to get the cheeseshop home page replaced in distutils.
>
>In particular, if I also keep the mod_proxy setup for the reverse
>proxy, how will it interact with the redirect for the GET only?

Well, if you are using mod_rewrite to do both the redirection and the 
proxying, then it should suffice to have the GET rewrite with [R] and 
the remainder use [P].

Something like:

RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_METHOD} ^GET$
RewriteRule ^pypi(.*)$ 
http://cheeseshop.python.org/pypi$1?%{QUERY_STRING} [R,L]
RewriteRule ^pypi(.*)$ 
http://cheeseshop.python.org/pypi$1?%{QUERY_STRING} [P,L]

But I'd test that with some dummy URLs instead of 'pypi', 
first.  Notice that this is not using any mod_proxy directives, just 
using mod_rewrite proxy support.  I've never used the mod_proxy 
directives, actually, but I have used mod_rewrite proxying.


From benji at benjiyork.com  Mon Jul 23 23:34:16 2007
From: benji at benjiyork.com (Benji York)
Date: Mon, 23 Jul 2007 17:34:16 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A50DB1.3080207@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<46A384C9.8040404@v.loewis.de>	<E31628FD-9208-4E22-88FF-BAAB86956994@zope.com>
	<46A38DF5.6010701@v.loewis.de> <46A4FA64.5050404@benjiyork.com>
	<4B15F81D-3980-47FD-AC61-47F8E1EED20F@zope.com>
	<46A50451.5050908@benjiyork.com> <46A50DB1.3080207@v.loewis.de>
Message-ID: <46A51ED8.4020502@benjiyork.com>

Martin v. L?wis wrote:
> My question was about the "simple" interface on the central
> server

Ah, I didn't realize.

> to which you seem to say "I don't need it at all - whether
> it's current and slow or behind and fast" (which, in a sense,
> is also a response to the question, namely "I don't care").

I think it's a great idea to have both human- and machine-targeted 
versions available.  It looks like setuptools is about twice as fast (in 
at least one instance) with the simple version.  That seems like a 
pretty big win to me.
-- 
Benji York
http://benjiyork.com

From richardjones at optushome.com.au  Tue Jul 24 00:02:26 2007
From: richardjones at optushome.com.au (Richard Jones)
Date: Tue, 24 Jul 2007 08:02:26 +1000
Subject: [Catalog-sig] setuptools upload to pypi
In-Reply-To: <46A50BF1.9020303@v.loewis.de>
References: <C9BBF98B-C127-40AD-8888-AF4E60EE9D64@lovelysystems.com>
	<46A50BF1.9020303@v.loewis.de>
Message-ID: <200707240802.26694.richardjones@optushome.com.au>

On Tue, 24 Jul 2007, Martin v. L?wis wrote:
> I think that's because I turned of proxying from www.python.org/pypi
> to cheeseshop.python.org/pypi, and replaced it with redirection
> (302, temporary redirect) instead (temporary just in case people
> find problems with that).
>
> (I asked a few days ago whether that would be a problem, and nobody
> said it would).

Sorry, I somehow totally missed your message on this.


     Richard

From pje at telecommunity.com  Tue Jul 24 00:56:51 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 23 Jul 2007 18:56:51 -0400
Subject: [Catalog-sig] setuptools upload to pypi
In-Reply-To: <20070723212920.8BFE63A40AA@sparrow.telecommunity.com>
References: <C9BBF98B-C127-40AD-8888-AF4E60EE9D64@lovelysystems.com>
	<46A50BF1.9020303@v.loewis.de>
	<20070723204445.65ABC3A40AA@sparrow.telecommunity.com>
	<46A51798.8000907@v.loewis.de>
	<20070723212920.8BFE63A40AA@sparrow.telecommunity.com>
Message-ID: <20070723225431.AB1063A40B2@sparrow.telecommunity.com>

At 05:31 PM 7/23/2007 -0400, Phillip J. Eby wrote:
>Something like:
>
>RewriteEngine On
>RewriteBase /
>RewriteCond %{REQUEST_METHOD} ^GET$
>RewriteRule ^pypi(.*)$
>http://cheeseshop.python.org/pypi$1?%{QUERY_STRING} [R,L]
>RewriteRule ^pypi(.*)$
>http://cheeseshop.python.org/pypi$1?%{QUERY_STRING} [P,L]

Ugh.  Looks like those lines wrapped in transit.  The two RewriteRule 
lines should be one line each, with the 'http:' appearing after the 
"^pypi(.*)$" and a space.


From martin at v.loewis.de  Tue Jul 24 06:33:22 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 24 Jul 2007 06:33:22 +0200
Subject: [Catalog-sig] setuptools upload to pypi
In-Reply-To: <9cee7ab80707231413q573c62bas8a03163e03ba9fc1@mail.gmail.com>
References: <C9BBF98B-C127-40AD-8888-AF4E60EE9D64@lovelysystems.com>	
	<46A50BF1.9020303@v.loewis.de>
	<9cee7ab80707231413q573c62bas8a03163e03ba9fc1@mail.gmail.com>
Message-ID: <46A58112.204@v.loewis.de>

> Basically, I think exposing human beings to the name "cheeseshop" in
> bad.  Specifically, it's confusing to anyone not familiar with a
> particular Monty Python skit.  A nice skit (IMO), but not a good
> public-facing name for PyPI.

Ok. However, I think this is a matter of taste, and he who designs
the system gets to name it. Barring a BDFL pronouncement or PSF
board decision, Cheeseshop is the name of that system, whether
people like that name or not.

So I have heard that, but this cannot stop me from fixing what
I consider a technical performance problem (namely, that all
data go through two machines).

Regards,
Martin

From martin at v.loewis.de  Tue Jul 24 06:40:18 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 24 Jul 2007 06:40:18 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <20070723211919.361E23A403D@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de>
	<20070723204446.294223A40B2@sparrow.telecommunity.com>
	<46A51A0C.2090800@v.loewis.de>
	<20070723211919.361E23A403D@sparrow.telecommunity.com>
Message-ID: <46A582B2.3060105@v.loewis.de>

Phillip J. Eby schrieb:
> At 11:13 PM 7/23/2007 +0200, Martin v. L?wis wrote:
>> > Yes, especially since compatibility with the existing installation
>> > base requires case insensitivity, because on case-insensitive
>> > platforms easy_install already normalizes the case of filenames it
>> > creates.  So, the question of what the "right thing" to do is in the
>> > abstract has already been moot for a year or two.
>>
>> Can you elaborate a bit, please? Why does the case of filenames
>> matter for the queries it makes?
> 
> In order to resolve dependencies, the system looks at installed .egg
> files and directories (and .egg-info direcories), and extracts package
> name and version info from the filenames.

Still - why does that require case-insensitive lookups to the index?

Suppose a package specifies a dependency Foo. IIUC, you look on disk
whether foo is already present, finding the version(s) of foo installed
in that process. Then, this either is satisfying or not. If it is,
you don't need the index at all. If it is not, you need to go to the
index - but you still know that it is Foo that you were looking for,
no? So lookups for dependencies in the index could always be
case-sensitive; please correct me if I'm wrong.

Regards,
Martin


From martin at v.loewis.de  Tue Jul 24 07:52:46 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 24 Jul 2007 07:52:46 +0200
Subject: [Catalog-sig] Cheeseshop webstats
Message-ID: <46A593AE.9030609@v.loewis.de>

For those who are curious, I started collecting webstats, at

http://cheeseshop.python.org/webstats/

Regards,
Martin

From jim at zope.com  Tue Jul 24 12:11:15 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 24 Jul 2007 06:11:15 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <20070723211919.361E23A403D@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de>
	<20070723204446.294223A40B2@sparrow.telecommunity.com>
	<46A51A0C.2090800@v.loewis.de>
	<20070723211919.361E23A403D@sparrow.telecommunity.com>
Message-ID: <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com>


On Jul 23, 2007, at 5:21 PM, Phillip J. Eby wrote:

> At 11:13 PM 7/23/2007 +0200, Martin v. L?wis wrote:
>>> Yes, especially since compatibility with the existing installation
>>> base requires case insensitivity, because on case-insensitive
>>> platforms easy_install already normalizes the case of filenames it
>>> creates.  So, the question of what the "right thing" to do is in the
>>> abstract has already been moot for a year or two.
>>
>> Can you elaborate a bit, please? Why does the case of filenames
>> matter for the queries it makes?
>>
>> AFAIU, it gets package names either from the user or from setup.py,
>> perhaps also from packages dependency inside .egg files (assuming
>> those support dependencies); these should all be case-sensitive.
>
> In order to resolve dependencies, the system looks at installed .egg
> files and directories (and .egg-info direcories), and extracts
> package name and version info from the filenames.

But the package name and version are in the PKG-INFO files, so it  
certainly has access to non-normalized names.  Why can't it double  
check a possible match against that file?

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From benji at benjiyork.com  Tue Jul 24 16:42:37 2007
From: benji at benjiyork.com (Benji York)
Date: Tue, 24 Jul 2007 10:42:37 -0400
Subject: [Catalog-sig] Prototype setuptools-specific PyPI index.
In-Reply-To: <46A4FA64.5050404@benjiyork.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	<46A23BAE.5090907@v.loewis.de>	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>	<46A259C4.6090605@v.loewis.de>	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>	<46A384C9.8040404@v.loewis.de>	<E31628FD-9208-4E22-88FF-BAAB86956994@zope.com>	<46A38DF5.6010701@v.loewis.de>
	<46A4FA64.5050404@benjiyork.com>
Message-ID: <46A60FDD.2030207@benjiyork.com>

Benji York wrote:
> I plan to do similar timings with the "simple" PyPI interface when I
> get a chance and report the results here.

Here are my non-scientific results:

buildout times:

regular: 4:52.86
simple: 3:15.57
ppix: 2:03.58

As everyone is aware, network latency has a large impact on this so here 
are the shortest round-trip packet times I got (with a small sample).

cheeseshop.python.org: 93ms
download.zope.org: 8ms

I suspect the majority/entirety of the difference between ppix and 
simple is network related.
-- 
Benji York
http://benjiyork.com

From pje at telecommunity.com  Tue Jul 24 17:31:19 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jul 2007 11:31:19 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de>
	<20070723204446.294223A40B2@sparrow.telecommunity.com>
	<46A51A0C.2090800@v.loewis.de>
	<20070723211919.361E23A403D@sparrow.telecommunity.com>
	<0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com>
Message-ID: <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>

At 06:11 AM 7/24/2007 -0400, Jim Fulton wrote:

>On Jul 23, 2007, at 5:21 PM, Phillip J. Eby wrote:
>
>>At 11:13 PM 7/23/2007 +0200, Martin v. L?wis wrote:
>>>>Yes, especially since compatibility with the existing installation
>>>>base requires case insensitivity, because on case-insensitive
>>>>platforms easy_install already normalizes the case of filenames it
>>>>creates.  So, the question of what the "right thing" to do is in the
>>>>abstract has already been moot for a year or two.
>>>
>>>Can you elaborate a bit, please? Why does the case of filenames
>>>matter for the queries it makes?
>>>
>>>AFAIU, it gets package names either from the user or from setup.py,
>>>perhaps also from packages dependency inside .egg files (assuming
>>>those support dependencies); these should all be case-sensitive.
>>
>>In order to resolve dependencies, the system looks at installed .egg
>>files and directories (and .egg-info direcories), and extracts
>>package name and version info from the filenames.
>
>But the package name and version are in the PKG-INFO files, so it
>certainly has access to non-normalized names.  Why can't it double
>check a possible match against that file?

Because if case actually made a difference, we couldn't have both 
packages installed in the same directory, could we?  And why add an 
extra file open (which currently is only needed for "develop" eggs) 
to the process of building a working set or environment, in order to 
confirm something whose only purpose is to make requirements more 
difficult to specify?  :)

Note that if what's bothering you is the package index access time, 
use Apache's mod_speling to enable case-insensitive URLs for the 
static page tree.


From jim at zope.com  Tue Jul 24 17:39:38 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 24 Jul 2007 11:39:38 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de>
	<20070723204446.294223A40B2@sparrow.telecommunity.com>
	<46A51A0C.2090800@v.loewis.de>
	<20070723211919.361E23A403D@sparrow.telecommunity.com>
	<0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com>
	<20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>
Message-ID: <37BB633B-B2E6-43BB-AB16-CFE807CF8625@zope.com>


On Jul 24, 2007, at 11:31 AM, Phillip J. Eby wrote:

> At 06:11 AM 7/24/2007 -0400, Jim Fulton wrote:
>
>> On Jul 23, 2007, at 5:21 PM, Phillip J. Eby wrote:
>>
>>> At 11:13 PM 7/23/2007 +0200, Martin v. L?wis wrote:
>>>>> Yes, especially since compatibility with the existing installation
>>>>> base requires case insensitivity, because on case-insensitive
>>>>> platforms easy_install already normalizes the case of filenames it
>>>>> creates.  So, the question of what the "right thing" to do is  
>>>>> in the
>>>>> abstract has already been moot for a year or two.
>>>>
>>>> Can you elaborate a bit, please? Why does the case of filenames
>>>> matter for the queries it makes?
>>>>
>>>> AFAIU, it gets package names either from the user or from setup.py,
>>>> perhaps also from packages dependency inside .egg files (assuming
>>>> those support dependencies); these should all be case-sensitive.
>>>
>>> In order to resolve dependencies, the system looks at installed .egg
>>> files and directories (and .egg-info direcories), and extracts
>>> package name and version info from the filenames.
>>
>> But the package name and version are in the PKG-INFO files, so it
>> certainly has access to non-normalized names.  Why can't it double
>> check a possible match against that file?
>
> Because if case actually made a difference, we couldn't have both  
> packages installed in the same directory, could we?  And why add an  
> extra file open (which currently is only needed for "develop" eggs)  
> to the process of building a working set or environment, in order  
> to confirm something whose only purpose is to make requirements  
> more difficult to specify?  :)

Currently, we allow packages to differ only in case.  The fact that  
setuptools pretends we don't doesn't change the fact that we do.  You  
said  that "compatibility with the existing installation base  
requires case insensitivity, because on case-insensitive platforms  
easy_install already normalizes the case of filenames it creates".  
I'm merely pointing out that we don't have to rely soley on the file  
name.

> Note that if what's bothering you is the package index access time,  
> use Apache's mod_speling to enable case-insensitive URLs for the  
> static page tree.

*If* we decide that package names are case insensitive, then we  
should do this. We haven't decided this.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Tue Jul 24 17:54:32 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jul 2007 11:54:32 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <37BB633B-B2E6-43BB-AB16-CFE807CF8625@zope.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de>
	<20070723204446.294223A40B2@sparrow.telecommunity.com>
	<46A51A0C.2090800@v.loewis.de>
	<20070723211919.361E23A403D@sparrow.telecommunity.com>
	<0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com>
	<20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>
	<37BB633B-B2E6-43BB-AB16-CFE807CF8625@zope.com>
Message-ID: <20070724155212.E16A93A40A7@sparrow.telecommunity.com>

At 11:39 AM 7/24/2007 -0400, Jim Fulton wrote:

>On Jul 24, 2007, at 11:31 AM, Phillip J. Eby wrote:
>
>>At 06:11 AM 7/24/2007 -0400, Jim Fulton wrote:
>>
>>>On Jul 23, 2007, at 5:21 PM, Phillip J. Eby wrote:
>>>
>>>>At 11:13 PM 7/23/2007 +0200, Martin v. L?wis wrote:
>>>>>>Yes, especially since compatibility with the existing installation
>>>>>>base requires case insensitivity, because on case-insensitive
>>>>>>platforms easy_install already normalizes the case of filenames it
>>>>>>creates.  So, the question of what the "right thing" to do is
>>>>>>in the
>>>>>>abstract has already been moot for a year or two.
>>>>>
>>>>>Can you elaborate a bit, please? Why does the case of filenames
>>>>>matter for the queries it makes?
>>>>>
>>>>>AFAIU, it gets package names either from the user or from setup.py,
>>>>>perhaps also from packages dependency inside .egg files (assuming
>>>>>those support dependencies); these should all be case-sensitive.
>>>>
>>>>In order to resolve dependencies, the system looks at installed .egg
>>>>files and directories (and .egg-info direcories), and extracts
>>>>package name and version info from the filenames.
>>>
>>>But the package name and version are in the PKG-INFO files, so it
>>>certainly has access to non-normalized names.  Why can't it double
>>>check a possible match against that file?
>>
>>Because if case actually made a difference, we couldn't have both
>>packages installed in the same directory, could we?  And why add an
>>extra file open (which currently is only needed for "develop" eggs)
>>to the process of building a working set or environment, in order
>>to confirm something whose only purpose is to make requirements
>>more difficult to specify?  :)
>
>Currently, we allow packages to differ only in case.  The fact that
>setuptools pretends we don't doesn't change the fact that we do.

I wasn't under the impression that we were discussing whether 
allowing project names to differ only in case was a good idea, since 
I haven't heard anybody give an argument that it's a *good* idea.  In 
fact, it seems like an obviously bad idea on its face, whether 
setuptools is in the picture or not.


>>Note that if what's bothering you is the package index access time,
>>use Apache's mod_speling to enable case-insensitive URLs for the
>>static page tree.
>
>*If* we decide that package names are case insensitive, then we
>should do this. We haven't decided this.

Well, so far the only argument *against* it that I recall seeing, is 
your argument that sloppy requirement specs slow everybody down by 
making them do the extra package index hit.  So, if that's fixable, 
what other argument is there for treating the names case-sensitively?


From jim at zope.com  Tue Jul 24 18:36:29 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 24 Jul 2007 12:36:29 -0400
Subject: [Catalog-sig] We need to make a decision wrt distribution names
Message-ID: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>


Obviously, we are having a debate about what forms distribution names  
can take.  I think we need a decision.

Does anyone know if there are existing rules for package names?  I  
can't find them if there are.  Up until now, I think we've been in  
somewhat of a prototyping mode, but I think it's time to move beyond  
that.

I strongly suggest that we need an official specification that says:

- what's a legal package name and

- what the equivalence rules for package names are.

Whatever we decide needs to be well supported by setuptools and  
PyPI.  I can live with whatever we decide as long as we decide  
something and make sure it is well communicated and implemented. In  
particular, I could live with the equivalence rules that setuptools  
uses if they are documented and if they are supported correctly and  
efficiently by the index (including mirrors).

IMO, a decision is extremely important.  If we can't reach consensus,  
then we need to call in the BDFL.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From waterbug at pangalactic.us  Tue Jul 24 18:49:43 2007
From: waterbug at pangalactic.us (Stephen Waterbury)
Date: Tue, 24 Jul 2007 12:49:43 -0400
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
	distribution names
In-Reply-To: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>
References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>
Message-ID: <46A62DA7.9000304@pangalactic.us>

Jim Fulton wrote:
> Does anyone know if there are existing rules for package names?  I  
> can't find them if there are.  ...

Well, there is PEP 8, which has this to say on the subject:

"Package and Module Names

       "Modules should have short, all-lowercase names.  Underscores
        can be used in the module name if it improves readability.
        Python packages should also have short, all-lowercase names,
        although the use of underscores is discouraged."

Steve

From jim at zope.com  Tue Jul 24 18:54:36 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 24 Jul 2007 12:54:36 -0400
Subject: [Catalog-sig] We need to make a decision wrt distribution names
In-Reply-To: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>
References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>
Message-ID: <41DE9981-BDAE-4A35-B989-DAD4749CA6BD@zope.com>


On Jul 24, 2007, at 12:36 PM, Jim Fulton wrote:

>
> Obviously, we are having a debate about what forms distribution  
> names can take.  I think we need a decision.
>
> Does anyone know if there are existing rules for package names?

Doh. I meant "distribution names". Sorry.

> I can't find them if there are.  Up until now, I think we've been  
> in somewhat of a prototyping mode, but I think it's time to move  
> beyond that.
>
> I strongly suggest that we need an official specification that says:
>
> - what's a legal package name and

Ditto

>
> - what the equivalence rules for package names are.

Ditto.

>
> Whatever we decide needs to be well supported by setuptools and  
> PyPI.  I can live with whatever we decide as long as we decide  
> something and make sure it is well communicated and implemented. In  
> particular, I could live with the equivalence rules that setuptools  
> uses if they are documented and if they are supported correctly and  
> efficiently by the index (including mirrors).
>
> IMO, a decision is extremely important.  If we can't reach  
> consensus, then we need to call in the BDFL.

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Tue Jul 24 18:55:39 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 24 Jul 2007 12:55:39 -0400
Subject: [Catalog-sig] We need to make a decision wrt distribution names
	(second try)
Message-ID: <ACAAE4EF-1EA0-401D-9F27-E652D98A971B@zope.com>


Obviously, we are having a debate about what forms distribution names  
can take.  I think we need a decision.

Does anyone know if there are existing rules for distribution names?   
I can't find them if there are.  Up until now, I think we've been in  
somewhat of a prototyping mode, but I think it's time to move beyond  
that.

I strongly suggest that we need an official specification that says:

- what's a legal distribution name and

- what the equivalence rules for distribution names are.

Whatever we decide needs to be well supported by setuptools and  
PyPI.  I can live with whatever we decide as long as we decide  
something and make sure it is well communicated and implemented. In  
particular, I could live with the equivalence rules that setuptools  
uses if they are documented and if they are supported correctly and  
efficiently by the index (including mirrors).

IMO, a decision is extremely important.  If we can't reach consensus,  
then we need to call in the BDFL.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Tue Jul 24 18:57:01 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 24 Jul 2007 12:57:01 -0400
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
	distribution names
In-Reply-To: <46A62DA7.9000304@pangalactic.us>
References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>
	<46A62DA7.9000304@pangalactic.us>
Message-ID: <DD07AC04-3A41-4799-98AC-73F92B3FE50D@zope.com>


On Jul 24, 2007, at 12:49 PM, Stephen Waterbury wrote:

> Jim Fulton wrote:
>> Does anyone know if there are existing rules for package names?  I
>> can't find them if there are.  ...
>
> Well, there is PEP 8, which has this to say on the subject:
>
> "Package and Module Names
>
>        "Modules should have short, all-lowercase names.  Underscores
>         can be used in the module name if it improves readability.
>         Python packages should also have short, all-lowercase names,
>         although the use of underscores is discouraged."

Doh, I was sloppy in my terminology. I should have said "distribution  
name".  We're talking about the names used in PyPI, the Python  
Distribution index. ;)  Also the value passed to the "name" argument  
of setup.

Sorry for the confusion.  :)

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From waterbug at pangalactic.us  Tue Jul 24 19:09:20 2007
From: waterbug at pangalactic.us (Stephen Waterbury)
Date: Tue, 24 Jul 2007 13:09:20 -0400
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
 distribution names
In-Reply-To: <DD07AC04-3A41-4799-98AC-73F92B3FE50D@zope.com>
References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>
	<46A62DA7.9000304@pangalactic.us>
	<DD07AC04-3A41-4799-98AC-73F92B3FE50D@zope.com>
Message-ID: <46A63240.7070003@pangalactic.us>

Jim Fulton wrote:
> 
> On Jul 24, 2007, at 12:49 PM, Stephen Waterbury wrote:
> 
>> Jim Fulton wrote:
>>> Does anyone know if there are existing rules for package names?  I
>>> can't find them if there are.  ...
>>
>> Well, there is PEP 8, which has this to say on the subject:
>>
>> "Package and Module Names
>>
>>        "Modules should have short, all-lowercase names.  Underscores
>>         can be used in the module name if it improves readability.
>>         Python packages should also have short, all-lowercase names,
>>         although the use of underscores is discouraged."
> 
> Doh, I was sloppy in my terminology. I should have said "distribution 
> name".  We're talking about the names used in PyPI, the Python 
> Distribution index. ;)  Also the value passed to the "name" argument of 
> setup.
> 
> Sorry for the confusion.  :)

Actually, I wasn't confused.  :)  I'd suggest a convention that allows
a distribution "title" (e.g., "Zope", "Twisted", etc.) and a
distribution "name" that would simply be the name of the
distribution's top-level package (e.g., "zope", "twisted", etc.), which
should follow the PEP 8 suggestion for package names and should be what
setuptools uses together with a version reference to uniquely
identify a specific distribution/version (egg).

Steve

From martin at v.loewis.de  Tue Jul 24 19:29:06 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 24 Jul 2007 19:29:06 +0200
Subject: [Catalog-sig] We need to make a decision wrt distribution names
In-Reply-To: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>
References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>
Message-ID: <46A636E2.2090408@v.loewis.de>

> I strongly suggest that we need an official specification that says:

The process would then be to write a PEP. It will end with a BDFL
pronouncement either way, but that might be easy to obtain if there
is consensus up-front.

Regards,
Martin

From jim at zope.com  Tue Jul 24 19:33:43 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 24 Jul 2007 13:33:43 -0400
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
	distribution names (second try)
In-Reply-To: <46A636DB.50105@ibp.de>
References: <ACAAE4EF-1EA0-401D-9F27-E652D98A971B@zope.com>
	<46A636DB.50105@ibp.de>
Message-ID: <AB302A9F-A705-48E4-B6B1-2CCEEC6CADE1@zope.com>


On Jul 24, 2007, at 1:28 PM, Lars Immisch wrote:

> Hi,
>
>> Obviously, we are having a debate about what forms distribution  
>> names  can take.  I think we need a decision.
>
> Thanks for bringing this up.
>
>> Does anyone know if there are existing rules for distribution  
>> names?   I can't find them if there are.  Up until now, I think  
>> we've been in  somewhat of a prototyping mode, but I think it's  
>> time to move beyond  that.
>> I strongly suggest that we need an official specification that says:
>> - what's a legal distribution name and
>> - what the equivalence rules for distribution names are.
>
> Comparison rules are also important:
>
> Is artin-1.2-rc2 < artin-1.2?

Note that these are not distribution names.  Well, that depends on  
how you define "distribution names". Sigh.  The dsitribution names  
I'm trying to talk about don't have version numbers.  I don't see a  
particular reason why these distribution names have to be ordered,

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From martin at v.loewis.de  Tue Jul 24 19:40:55 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 24 Jul 2007 19:40:55 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de>
	<20070723204446.294223A40B2@sparrow.telecommunity.com>
	<46A51A0C.2090800@v.loewis.de>
	<20070723211919.361E23A403D@sparrow.telecommunity.com>
	<0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com>
	<20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>
Message-ID: <46A639A7.7090305@v.loewis.de>

>> But the package name and version are in the PKG-INFO files, so it
>> certainly has access to non-normalized names.  Why can't it double
>> check a possible match against that file?
> 
> Because if case actually made a difference, we couldn't have both
> packages installed in the same directory, could we?

Right. However, there is a difference between case-insensitive,
and case-preserving.

> Note that if what's bothering you is the package index access time, use
> Apache's mod_speling to enable case-insensitive URLs for the static page
> tree.

That won't help. If you look for a name of a non-registered package,
setuptools will go to the index even if mod_speling corrects spelling
errors.

Such an approach is only possible if setuptools would stop using
the entire index if the server has case-insensitive lookup (which
it cannot determine).

Regards,
Martin


From pje at telecommunity.com  Tue Jul 24 19:45:08 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jul 2007 13:45:08 -0400
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
 distribution names
In-Reply-To: <46A63240.7070003@pangalactic.us>
References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>
	<46A62DA7.9000304@pangalactic.us>
	<DD07AC04-3A41-4799-98AC-73F92B3FE50D@zope.com>
	<46A63240.7070003@pangalactic.us>
Message-ID: <20070724174248.F40AA3A40A7@sparrow.telecommunity.com>

At 01:09 PM 7/24/2007 -0400, Stephen Waterbury wrote:
>Actually, I wasn't confused.  :)  I'd suggest a convention that allows
>a distribution "title" (e.g., "Zope", "Twisted", etc.) and a
>distribution "name" that would simply be the name of the
>distribution's top-level package (e.g., "zope", "twisted", etc.),

This proposal would rule out namespace packages, in addition to being 
incompatible with existing distribution names.

Note that package != distribution -- a distribution may contain zero 
or more packages (even top-level), *and* a single package (top-level 
or otherwise) may be spread over more than one distribution.

Also note that this was true even with the distutils, long before 
setuptools existed.


From pje at telecommunity.com  Tue Jul 24 19:48:54 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jul 2007 13:48:54 -0400
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
 distribution names (second try)
In-Reply-To: <AB302A9F-A705-48E4-B6B1-2CCEEC6CADE1@zope.com>
References: <ACAAE4EF-1EA0-401D-9F27-E652D98A971B@zope.com>
	<46A636DB.50105@ibp.de>
	<AB302A9F-A705-48E4-B6B1-2CCEEC6CADE1@zope.com>
Message-ID: <20070724174634.2283E3A40A7@sparrow.telecommunity.com>

At 01:33 PM 7/24/2007 -0400, Jim Fulton wrote:
>Note that these are not distribution names.  Well, that depends on
>how you define "distribution names". Sigh.  The dsitribution names
>I'm trying to talk about don't have version numbers.

Setuptools uses the term "project name" for what you're calling a 
distribution name, if that helps any.  :)


From pje at telecommunity.com  Tue Jul 24 19:52:32 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jul 2007 13:52:32 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <46A639A7.7090305@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de>
	<20070723204446.294223A40B2@sparrow.telecommunity.com>
	<46A51A0C.2090800@v.loewis.de>
	<20070723211919.361E23A403D@sparrow.telecommunity.com>
	<0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com>
	<20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>
	<46A639A7.7090305@v.loewis.de>
Message-ID: <20070724175013.238323A40A7@sparrow.telecommunity.com>

At 07:40 PM 7/24/2007 +0200, Martin v. L?wis wrote:
> >> But the package name and version are in the PKG-INFO files, so it
> >> certainly has access to non-normalized names.  Why can't it double
> >> check a possible match against that file?
> >
> > Because if case actually made a difference, we couldn't have both
> > packages installed in the same directory, could we?
>
>Right. However, there is a difference between case-insensitive,
>and case-preserving.

I don't understand your statement here, nor what is supposed to follow from it.


> > Note that if what's bothering you is the package index access time, use
> > Apache's mod_speling to enable case-insensitive URLs for the static page
> > tree.
>
>That won't help. If you look for a name of a non-registered package,
>setuptools will go to the index even if mod_speling corrects spelling
>errors.

Jim's objection was that if it's possible to get case-correction from 
the index, people will declare setup.py dependencies with incorrect 
case, leading to other packages having indirect dependencies with 
incorrect case, leading to lots of package index lookups.

This objection is relevant only to requirements which differ from the 
actual project name only by their case.  A non-registered package 
lookup is going to fail no matter what, and thus isn't going to wind 
up in a setup.py without a dependency_links specifier that will 
prevent it being looked up in the package index to begin with.


From jim at zope.com  Tue Jul 24 19:51:31 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 24 Jul 2007 13:51:31 -0400
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
	distribution names (second try)
In-Reply-To: <20070724174634.2283E3A40A7@sparrow.telecommunity.com>
References: <ACAAE4EF-1EA0-401D-9F27-E652D98A971B@zope.com>
	<46A636DB.50105@ibp.de>
	<AB302A9F-A705-48E4-B6B1-2CCEEC6CADE1@zope.com>
	<20070724174634.2283E3A40A7@sparrow.telecommunity.com>
Message-ID: <0A0EDDEC-7BCC-41C1-822A-5D93AF20E1F7@zope.com>


On Jul 24, 2007, at 1:48 PM, Phillip J. Eby wrote:

> At 01:33 PM 7/24/2007 -0400, Jim Fulton wrote:
>> Note that these are not distribution names.  Well, that depends on
>> how you define "distribution names". Sigh.  The dsitribution names
>> I'm trying to talk about don't have version numbers.
>
> Setuptools uses the term "project name" for what you're calling a  
> distribution name, if that helps any.  :)

Right.  I'm happy to use that. Does anyone want to disagree?

BTW, to up the ante, I volunteer to try to update the distutils  
document to reflect what we decide.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From lars at ibp.de  Tue Jul 24 19:28:59 2007
From: lars at ibp.de (Lars Immisch)
Date: Tue, 24 Jul 2007 19:28:59 +0200
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
 distribution names (second try)
In-Reply-To: <ACAAE4EF-1EA0-401D-9F27-E652D98A971B@zope.com>
References: <ACAAE4EF-1EA0-401D-9F27-E652D98A971B@zope.com>
Message-ID: <46A636DB.50105@ibp.de>

Hi,

> Obviously, we are having a debate about what forms distribution names  
> can take.  I think we need a decision.

Thanks for bringing this up.

> Does anyone know if there are existing rules for distribution names?   
> I can't find them if there are.  Up until now, I think we've been in  
> somewhat of a prototyping mode, but I think it's time to move beyond  
> that.
> 
> I strongly suggest that we need an official specification that says:
> 
> - what's a legal distribution name and
> 
> - what the equivalence rules for distribution names are.

Comparison rules are also important:

Is artin-1.2-rc2 < artin-1.2?

IMO, it's perfectly fine to just state: comparisons are lexicographical 
(ASCII only). But I'd like to see this mentioned somewhere.

- Lars

From lars at ibp.de  Tue Jul 24 20:11:57 2007
From: lars at ibp.de (Lars Immisch)
Date: Tue, 24 Jul 2007 20:11:57 +0200
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
 distribution names (second try)
In-Reply-To: <AB302A9F-A705-48E4-B6B1-2CCEEC6CADE1@zope.com>
References: <ACAAE4EF-1EA0-401D-9F27-E652D98A971B@zope.com>
	<46A636DB.50105@ibp.de>
	<AB302A9F-A705-48E4-B6B1-2CCEEC6CADE1@zope.com>
Message-ID: <46A640ED.8070406@ibp.de>

Hi,

>>> Obviously, we are having a debate about what forms distribution 
>>> names  can take.  I think we need a decision.
>>
>> Thanks for bringing this up.
>>
>>> Does anyone know if there are existing rules for distribution 
>>> names?   I can't find them if there are.  Up until now, I think we've 
>>> been in  somewhat of a prototyping mode, but I think it's time to 
>>> move beyond  that.
>>> I strongly suggest that we need an official specification that says:
>>> - what's a legal distribution name and
>>> - what the equivalence rules for distribution names are.
>>
>> Comparison rules are also important:
>>
>> Is artin-1.2-rc2 < artin-1.2?
> 
> Note that these are not distribution names.  Well, that depends on how 
> you define "distribution names". Sigh.  The dsitribution names I'm 
> trying to talk about don't have version numbers.  I don't see a 
> particular reason why these distribution names have to be ordered,

I see. Sorry for the drive-by-shooting.

Still, I'd like a stated convention how version numbers are compared. I 
believe this would be good for setuptools also.

But the issue is separable from project naming conventions.

- Lars

From martin at v.loewis.de  Tue Jul 24 20:21:11 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 24 Jul 2007 20:21:11 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <20070724175013.238323A40A7@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de>
	<20070723204446.294223A40B2@sparrow.telecommunity.com>
	<46A51A0C.2090800@v.loewis.de>
	<20070723211919.361E23A403D@sparrow.telecommunity.com>
	<0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com>
	<20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>
	<46A639A7.7090305@v.loewis.de>
	<20070724175013.238323A40A7@sparrow.telecommunity.com>
Message-ID: <46A64317.6090907@v.loewis.de>

>> > Because if case actually made a difference, we couldn't have both
>> > packages installed in the same directory, could we?
>>
>> Right. However, there is a difference between case-insensitive,
>> and case-preserving.
> 
> I don't understand your statement here, nor what is supposed to follow
> from it.

Clearly, on a case-insensitive file system, project names differing
only in case cannot coexist. That doesn't mean that all references
to the project should be case-normalized (e.g. lower-cased).

So even if project names compare case-insensitive, there still
should (could) be a "right" spelling, the one that the package
author wants to see. This is the spelling that others then should
use.

So I still don't see why the file names on disk have any effect
on the lookup setuptools do to the index.

> Jim's objection was that if it's possible to get case-correction from
> the index, people will declare setup.py dependencies with incorrect
> case, leading to other packages having indirect dependencies with
> incorrect case, leading to lots of package index lookups.

I don't think that was his objection. IIUC, he complains about
incorrect spellings as bad, period - regardless of whether they also
have a performance effect. It's like spelling your name "Philipp" -
that's a bad thing to do, independent of whether it also makes you
harder to find (which it actually doesn't, thanks to Google).

> This objection is relevant only to requirements which differ from the
> actual project name only by their case.  A non-registered package lookup
> is going to fail no matter what, and thus isn't going to wind up in a
> setup.py without a dependency_links specifier that will prevent it being
> looked up in the package index to begin with.

Right. However, if setuptools would stop making case insensitive
lookups to the index, lookups to unregistered packages would become
more efficient.

Regards,
Martin


From pje at telecommunity.com  Tue Jul 24 20:44:08 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jul 2007 14:44:08 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <46A64317.6090907@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de>
	<20070723204446.294223A40B2@sparrow.telecommunity.com>
	<46A51A0C.2090800@v.loewis.de>
	<20070723211919.361E23A403D@sparrow.telecommunity.com>
	<0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com>
	<20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>
	<46A639A7.7090305@v.loewis.de>
	<20070724175013.238323A40A7@sparrow.telecommunity.com>
	<46A64317.6090907@v.loewis.de>
Message-ID: <20070724184151.1EAE53A40A7@sparrow.telecommunity.com>

At 08:21 PM 7/24/2007 +0200, Martin v. L?wis wrote:
> >> > Because if case actually made a difference, we couldn't have both
> >> > packages installed in the same directory, could we?
> >>
> >> Right. However, there is a difference between case-insensitive,
> >> and case-preserving.
> >
> > I don't understand your statement here, nor what is supposed to follow
> > from it.
>
>Clearly, on a case-insensitive file system, project names differing
>only in case cannot coexist. That doesn't mean that all references
>to the project should be case-normalized (e.g. lower-cased).
>
>So even if project names compare case-insensitive, there still
>should (could) be a "right" spelling, the one that the package
>author wants to see. This is the spelling that others then should
>use.

Well, that spelling will certainly show up everywhere.  Setuptools is 
case-preserving, *except* with regard to installing egg files on 
case-insensitive filesystems (as defined by what os.path.normcase 
does on a given platform).  When it installs an egg, it normalizes 
the case of the target path.  In all other matters it is 
case-insensitive for comparison, but case-preserving of the inputs it receives.


> > Jim's objection was that if it's possible to get case-correction from
> > the index, people will declare setup.py dependencies with incorrect
> > case, leading to other packages having indirect dependencies with
> > incorrect case, leading to lots of package index lookups.
>
>I don't think that was his objection. IIUC, he complains about
>incorrect spellings as bad, period - regardless of whether they also
>have a performance effect. It's like spelling your name "Philipp" -
>that's a bad thing to do, independent of whether it also makes you
>harder to find (which it actually doesn't, thanks to Google).

It's actually more like spelling my name "phillip", which is arguably 
still spelled correctly, if punctuated poorly.  :)

And it's also an answer to the wrong question: the *first* question 
is whether we should allow "phillip" and "Phillip" to co-exist in the 
package index.  If not, then there is the question of whether there 
is any reason to be case-sensitive with respect to searching.

If we are agreed that having projects whose names differ only by case 
is a bad idea, then the latter question is considerably less controversial.


> > This objection is relevant only to requirements which differ from the
> > actual project name only by their case.  A non-registered package lookup
> > is going to fail no matter what, and thus isn't going to wind up in a
> > setup.py without a dependency_links specifier that will prevent it being
> > looked up in the package index to begin with.
>
>Right. However, if setuptools would stop making case insensitive
>lookups to the index, lookups to unregistered packages would become
>more efficient.

I'm not sure I follow you.  If a non-registered package is used as a 
dependency, the setup() will need to specify dependency_links, in 
which case PyPI will not be consulted.


From martin at v.loewis.de  Tue Jul 24 20:54:24 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 24 Jul 2007 20:54:24 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <20070724184151.1EAE53A40A7@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de>
	<20070723204446.294223A40B2@sparrow.telecommunity.com>
	<46A51A0C.2090800@v.loewis.de>
	<20070723211919.361E23A403D@sparrow.telecommunity.com>
	<0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com>
	<20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>
	<46A639A7.7090305@v.loewis.de>
	<20070724175013.238323A40A7@sparrow.telecommunity.com>
	<46A64317.6090907@v.loewis.de>
	<20070724184151.1EAE53A40A7@sparrow.telecommunity.com>
Message-ID: <46A64AE0.7000307@v.loewis.de>

>> Right. However, if setuptools would stop making case insensitive
>> lookups to the index, lookups to unregistered packages would become
>> more efficient.
> 
> I'm not sure I follow you.  If a non-registered package is used as a
> dependency, the setup() will need to specify dependency_links, in which
> case PyPI will not be consulted.

Ah, ok. So is it then correct that setuptools never looks at pypi/,
unless the user misspelled a package name on the command line?

Regards,
Martin


From jim at zope.com  Tue Jul 24 21:20:40 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 24 Jul 2007 15:20:40 -0400
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
	distribution names (second try)
In-Reply-To: <46A640ED.8070406@ibp.de>
References: <ACAAE4EF-1EA0-401D-9F27-E652D98A971B@zope.com>
	<46A636DB.50105@ibp.de>
	<AB302A9F-A705-48E4-B6B1-2CCEEC6CADE1@zope.com>
	<46A640ED.8070406@ibp.de>
Message-ID: <1FE480D2-27FD-4F42-82FD-06C387805EE2@zope.com>


On Jul 24, 2007, at 2:11 PM, Lars Immisch wrote:

> Hi,
>
>>>> Obviously, we are having a debate about what forms distribution  
>>>> names  can take.  I think we need a decision.
>>>
>>> Thanks for bringing this up.
>>>
>>>> Does anyone know if there are existing rules for distribution  
>>>> names?   I can't find them if there are.  Up until now, I think  
>>>> we've been in  somewhat of a prototyping mode, but I think it's  
>>>> time to move beyond  that.
>>>> I strongly suggest that we need an official specification that  
>>>> says:
>>>> - what's a legal distribution name and
>>>> - what the equivalence rules for distribution names are.
>>>
>>> Comparison rules are also important:
>>>
>>> Is artin-1.2-rc2 < artin-1.2?
>> Note that these are not distribution names.  Well, that depends on  
>> how you define "distribution names". Sigh.  The dsitribution names  
>> I'm trying to talk about don't have version numbers.  I don't see  
>> a particular reason why these distribution names have to be ordered,
>
> I see. Sorry for the drive-by-shooting.

np

> Still, I'd like a stated convention how version numbers are  
> compared. I believe this would be good for setuptools also.

setuptools has this.  It would be nice bless it in a PEP.

> But the issue is separable from project naming conventions.

Yup.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From pje at telecommunity.com  Tue Jul 24 21:46:42 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jul 2007 15:46:42 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <46A64AE0.7000307@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<46A23BAE.5090907@v.loewis.de>
	<932E4411-77F7-42A2-BCFB-B7B5D005EBB6@zope.com>
	<46A259C4.6090605@v.loewis.de>
	<FD8BD3D0-C9F0-450E-9F64-CBAA1EB7A1CB@zope.com>
	<20070722164922.AE50D3A40A9@sparrow.telecommunity.com>
	<799F00B4-AEAB-446D-B45A-B96B089C6C2C@zope.com>
	<20070723152015.E7AFA3A403D@sparrow.telecommunity.com>
	<46A4D0BE.4030706@palladion.com> <46A509D4.3070108@v.loewis.de>
	<20070723204446.294223A40B2@sparrow.telecommunity.com>
	<46A51A0C.2090800@v.loewis.de>
	<20070723211919.361E23A403D@sparrow.telecommunity.com>
	<0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com>
	<20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>
	<46A639A7.7090305@v.loewis.de>
	<20070724175013.238323A40A7@sparrow.telecommunity.com>
	<46A64317.6090907@v.loewis.de>
	<20070724184151.1EAE53A40A7@sparrow.telecommunity.com>
	<46A64AE0.7000307@v.loewis.de>
Message-ID: <20070724194426.828EE3A40A7@sparrow.telecommunity.com>

At 08:54 PM 7/24/2007 +0200, Martin v. L?wis wrote:
> >> Right. However, if setuptools would stop making case insensitive
> >> lookups to the index, lookups to unregistered packages would become
> >> more efficient.
> >
> > I'm not sure I follow you.  If a non-registered package is used as a
> > dependency, the setup() will need to specify dependency_links, in which
> > case PyPI will not be consulted.
>
>Ah, ok. So is it then correct that setuptools never looks at pypi/,
>unless the user misspelled a package name on the command line?

Pretty much, yes.


From martin at v.loewis.de  Tue Jul 24 21:55:47 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 24 Jul 2007 21:55:47 +0200
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
Message-ID: <46A65943.7000302@v.loewis.de>

After some discussion, it seems that nobody really likes
the name "cheeseshop" for the Python Package Index,
and some people seem to actively hate it.

So I'm going to change the name (again/back): the software
will call itself "Python Package Index", abbreviated as
pypi (PyPI where case matters). The machine address
cheeseshop.python.org will continue to work for a
foreseeable future, but will not be actively advertised.

Regards,
Martin

From noah.gift at gmail.com  Tue Jul 24 21:57:30 2007
From: noah.gift at gmail.com (Noah Gift)
Date: Tue, 24 Jul 2007 15:57:30 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <20070724194426.828EE3A40A7@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<20070723211919.361E23A403D@sparrow.telecommunity.com>
	<0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com>
	<20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>
	<46A639A7.7090305@v.loewis.de>
	<20070724175013.238323A40A7@sparrow.telecommunity.com>
	<46A64317.6090907@v.loewis.de>
	<20070724184151.1EAE53A40A7@sparrow.telecommunity.com>
	<46A64AE0.7000307@v.loewis.de>
	<20070724194426.828EE3A40A7@sparrow.telecommunity.com>
Message-ID: <e91cc0270707241257g4d65e2fblfcbde19983adfa53@mail.gmail.com>

On 7/24/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 08:54 PM 7/24/2007 +0200, Martin v. L?wis wrote:
> > >> Right. However, if setuptools would stop making case insensitive
> > >> lookups to the index, lookups to unregistered packages would become
> > >> more efficient.
> > >
> > > I'm not sure I follow you.  If a non-registered package is used as a
> > > dependency, the setup() will need to specify dependency_links, in which
> > > case PyPI will not be consulted.
> >
> >Ah, ok. So is it then correct that setuptools never looks at pypi/,
> >unless the user misspelled a package name on the command line?
>
> Pretty much, yes.

Would it be a bad idea to suggest the case insensitive lookup happen
against a local flat file that gets diff'd from PyPI?  Then only the
culprit gets punished using their own CPU :)

From martin at v.loewis.de  Tue Jul 24 22:02:27 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 24 Jul 2007 22:02:27 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <e91cc0270707241257g4d65e2fblfcbde19983adfa53@mail.gmail.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	
	<20070723211919.361E23A403D@sparrow.telecommunity.com>	
	<0CED8E91-F8C1-4951-A4C6-F7DDA81BE027@zope.com>	
	<20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>	
	<46A639A7.7090305@v.loewis.de>	
	<20070724175013.238323A40A7@sparrow.telecommunity.com>	
	<46A64317.6090907@v.loewis.de>	
	<20070724184151.1EAE53A40A7@sparrow.telecommunity.com>	
	<46A64AE0.7000307@v.loewis.de>	
	<20070724194426.828EE3A40A7@sparrow.telecommunity.com>
	<e91cc0270707241257g4d65e2fblfcbde19983adfa53@mail.gmail.com>
Message-ID: <46A65AD3.3060607@v.loewis.de>

> Would it be a bad idea to suggest the case insensitive lookup happen
> against a local flat file that gets diff'd from PyPI?  Then only the
> culprit gets punished using their own CPU :)

What does it mean to "diff a flat file from PyPI"?

Regards,
Martin

From noah.gift at gmail.com  Tue Jul 24 22:07:15 2007
From: noah.gift at gmail.com (Noah Gift)
Date: Tue, 24 Jul 2007 16:07:15 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <46A65AD3.3060607@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>
	<46A639A7.7090305@v.loewis.de>
	<20070724175013.238323A40A7@sparrow.telecommunity.com>
	<46A64317.6090907@v.loewis.de>
	<20070724184151.1EAE53A40A7@sparrow.telecommunity.com>
	<46A64AE0.7000307@v.loewis.de>
	<20070724194426.828EE3A40A7@sparrow.telecommunity.com>
	<e91cc0270707241257g4d65e2fblfcbde19983adfa53@mail.gmail.com>
	<46A65AD3.3060607@v.loewis.de>
Message-ID: <e91cc0270707241307g482cbe9fn96e170820955a731@mail.gmail.com>

On 7/24/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > Would it be a bad idea to suggest the case insensitive lookup happen
> > against a local flat file that gets diff'd from PyPI?  Then only the
> > culprit gets punished using their own CPU :)
>
> What does it mean to "diff a flat file from PyPI"?
I am familiar with an open source project called Radmind.  It
maintains machines be keeping a local transcript with all of the files
and "overloads" on it.  When you modify the file system you diff the
changes into an overload and put them on the server.

When the client asks for an update, the client checks to see if its
transcript files are the same.  If they are then it does nothing.  If
it different the file(s) get updated.  Then the magic is that the
search and replace for which files it needs to grab are done locally
using ton's of local CPU.  When the client resolves all of the files
it needs, it then grabs them from the server.

It is a nifty design:  http://rsug.itd.umich.edu/software/radmind/

So, if someone does an "incorrect" search, easy_install checks to see
first if it has the latest "file".  If not, it then replaces its local
index.  Then the search happens locally, not being going back and
forth to the server.



>
> Regards,
> Martin
>


-- 
http://www.blog.noahgift.com

From pje at telecommunity.com  Tue Jul 24 22:21:29 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jul 2007 16:21:29 -0400
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <46A65943.7000302@v.loewis.de>
References: <46A65943.7000302@v.loewis.de>
Message-ID: <20070724201911.1210E3A40A7@sparrow.telecommunity.com>

At 09:55 PM 7/24/2007 +0200, Martin v. L?wis wrote:
>After some discussion, it seems that nobody really likes
>the name "cheeseshop" for the Python Package Index,
>and some people seem to actively hate it.

I was under the impression that that's also the case for the name 
"PyPI", which was changed because of difficulty of disambiguating 
from "PyPy" in conversation.

Cheeseshop is at least a word that is obviously a noun, and it is in 
somewhat more common use, with 224000 google hits for "cheeseshop 
python -monty", versus 199,000 for "pypi python -monty".


From martin at v.loewis.de  Tue Jul 24 22:23:58 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 24 Jul 2007 22:23:58 +0200
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <e91cc0270707241307g482cbe9fn96e170820955a731@mail.gmail.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>	
	<20070724153349.CAF4B3A40AE@sparrow.telecommunity.com>	
	<46A639A7.7090305@v.loewis.de>	
	<20070724175013.238323A40A7@sparrow.telecommunity.com>	
	<46A64317.6090907@v.loewis.de>	
	<20070724184151.1EAE53A40A7@sparrow.telecommunity.com>	
	<46A64AE0.7000307@v.loewis.de>	
	<20070724194426.828EE3A40A7@sparrow.telecommunity.com>	
	<e91cc0270707241257g4d65e2fblfcbde19983adfa53@mail.gmail.com>	
	<46A65AD3.3060607@v.loewis.de>
	<e91cc0270707241307g482cbe9fn96e170820955a731@mail.gmail.com>
Message-ID: <46A65FDE.7080806@v.loewis.de>

> I am familiar with an open source project called Radmind.  It
> maintains machines be keeping a local transcript with all of the files
> and "overloads" on it.  When you modify the file system you diff the
> changes into an overload and put them on the server.

That's still a lot of terminology which I don't understand, and have
no intuition for, perhaps because English is not my native language.
I give up trying to understand - just to give you an idea:
What's a "transcript of files"? How do you "overload" on it (why
is "to overload" used with the preposition "on")? How do I "diff" a
change "into" "an overload" (which now is a noun, it seems)?

> So, if someone does an "incorrect" search, easy_install checks to see
> first if it has the latest "file".  If not, it then replaces its local
> index.  Then the search happens locally, not being going back and
> forth to the server.

I think this brings us to the real issue: you asked whether this would
be a bad idea to suggest that? I now think "perhaps not bad, but
unhelpful, unless you also contribute an implementation of it".
It's a change to setuptools, which is still mostly a one-man-show,
(IIUC), so proposing ideas in general is futile (as for most software
with a single author - including PyPI); the single author cannot
possibly implement all the ideas people have.

Regards,
Martin

From jim at zope.com  Tue Jul 24 22:36:31 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 24 Jul 2007 16:36:31 -0400
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <46A65943.7000302@v.loewis.de>
References: <46A65943.7000302@v.loewis.de>
Message-ID: <682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com>


On Jul 24, 2007, at 3:55 PM, Martin v. L?wis wrote:

> After some discussion, it seems that nobody really likes
> the name "cheeseshop" for the Python Package Index,
> and some people seem to actively hate it.
>
> So I'm going to change the name (again/back): the software
> will call itself "Python Package Index", abbreviated as
> pypi (PyPI where case matters). The machine address
> cheeseshop.python.org will continue to work for a
> foreseeable future, but will not be actively advertised.

I think this is progress.

I'll note that "Package Index"  is somewhat misleading, because it  
actually indexes distributions, not packages.  A more precise name  
would be pydi and wouldn't be so easily confused with pypy.  (Jim  
ducks. Jim looks forward to pydi tee shirts.)

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From noah.gift at gmail.com  Tue Jul 24 22:37:49 2007
From: noah.gift at gmail.com (Noah Gift)
Date: Tue, 24 Jul 2007 16:37:49 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <46A65FDE.7080806@v.loewis.de>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<20070724175013.238323A40A7@sparrow.telecommunity.com>
	<46A64317.6090907@v.loewis.de>
	<20070724184151.1EAE53A40A7@sparrow.telecommunity.com>
	<46A64AE0.7000307@v.loewis.de>
	<20070724194426.828EE3A40A7@sparrow.telecommunity.com>
	<e91cc0270707241257g4d65e2fblfcbde19983adfa53@mail.gmail.com>
	<46A65AD3.3060607@v.loewis.de>
	<e91cc0270707241307g482cbe9fn96e170820955a731@mail.gmail.com>
	<46A65FDE.7080806@v.loewis.de>
Message-ID: <e91cc0270707241337r3ee5a9ccwb73286976007b3a7@mail.gmail.com>

My real motive is selfishness.  I like that easy_install in not case
sensitive, as I and other people I am helping to learn Python.  I just
hope that doesn't go away.  My suggestion is mored geared toward, how
do I "keep" that feature :)

> no intuition for, perhaps because English is not my native language.
> I give up trying to understand - just to give you an idea:

I apologize, I can be very lazy when I type.

> I now think "perhaps not bad, but
> unhelpful, unless you also contribute an implementation of it".
> It's a change to setuptools, which is still mostly a one-man-show,
> (IIUC), so proposing ideas in general is futile (as for most software
> with a single author - including PyPI); the single author cannot
> possibly implement all the ideas people have.

The basic algorithm is that a local index of PyPi could be kept in one
file.  If an incorrect search was made, the first action to occur
would be to check if the local file was the same as the file on the
server.  If not, it would sync the changes with svn.  Then
easy_install would try to do lookups against the local file to find a
match.

I am happy to help if you need help.  I am particular interest in
easy_install as I am writing a chapter on it for an O'Reilly book as
well, again a partially selfish motive :)

Noah
>
> Regards,
> Martin
>


-- 
http://www.blog.noahgift.com

From martin at v.loewis.de  Tue Jul 24 23:35:24 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 24 Jul 2007 23:35:24 +0200
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <20070724201911.1210E3A40A7@sparrow.telecommunity.com>
References: <46A65943.7000302@v.loewis.de>
	<20070724201911.1210E3A40A7@sparrow.telecommunity.com>
Message-ID: <46A6709C.1040004@v.loewis.de>

>> After some discussion, it seems that nobody really likes
>> the name "cheeseshop" for the Python Package Index,
>> and some people seem to actively hate it.
> 
> I was under the impression that that's also the case for the name
> "PyPI", which was changed because of difficulty of disambiguating from
> "PyPy" in conversation.

That may be the case - however, Guido van Rossum said he would like
to see PyPI promoted, and thought that this already had been decided.
Richard Jones doesn't object; so PyPI it is.

> Cheeseshop is at least a word that is obviously a noun, and it is in
> somewhat more common use, with 224000 google hits for "cheeseshop python
> -monty", versus 199,000 for "pypi python -monty".

Sure. I can see all the reasons why one would like to have something
like that. However, it's an authority decision, and I firmly believe
in authority when it comes to naming things - somebody has to pick
a name, and PyPI is the name that got picked (along with its full
spelling of "Python Package Index" - google for that also)

But then, I can't even see why the number of hits is important - what
matters is what comes out at place 1 in Google.

Regards,
Martin

From philipp at weitershausen.de  Tue Jul 24 23:33:40 2007
From: philipp at weitershausen.de (Philipp von Weitershausen)
Date: Tue, 24 Jul 2007 23:33:40 +0200
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <46A65943.7000302@v.loewis.de>
References: <46A65943.7000302@v.loewis.de>
Message-ID: <f85ra7$i1q$1@sea.gmane.org>

Martin v. L?wis wrote:
> After some discussion, it seems that nobody really likes
> the name "cheeseshop" for the Python Package Index,
> and some people seem to actively hate it.

Not sure if this makes a difference, but I'm one of the people who 
actively love it (whatever that means :)).

> So I'm going to change the name (again/back): the software
> will call itself "Python Package Index", abbreviated as
> pypi (PyPI where case matters). The machine address
> cheeseshop.python.org will continue to work for a
> foreseeable future, but will not be actively advertised.

To avoid confusion with PyPy, we can perhaps encourage a different 
pronounciation (or a totally different name as Jim has suggested). I've 
been pronouncing it "pippi" (as in Longstocking) myself.


-- 
http://worldcookery.com -- Professional Zope documentation and training


From martin at v.loewis.de  Tue Jul 24 23:43:20 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 24 Jul 2007 23:43:20 +0200
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com>
References: <46A65943.7000302@v.loewis.de>
	<682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com>
Message-ID: <46A67278.6000709@v.loewis.de>

> I'll note that "Package Index"  is somewhat misleading, because it
> actually indexes distributions, not packages.  A more precise name would
> be pydi and wouldn't be so easily confused with pypy.  (Jim ducks. Jim
> looks forward to pydi tee shirts.)

That's actually irrelevant. There is no option to name it something
*other* than either PyPI or Cheeseshop. Naming it something else
has been tried and failed tremendously, so I won't try it again
(and being the main PyPI maintainer at the moment, nobody else
has a chance to try something else - if you want to give it a
name, contribute to it for a few years, then have your own try).

FWIW, "distribution" is quite misleading. SuSE is a Distribution
(of Linux), and so is Debian; ActivePython is a distribution of
Python (I think - before coming to that area, I would have though
that "the distribution of Python varies across continents").

Regards,
Martin

From jim at zope.com  Tue Jul 24 23:49:15 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 24 Jul 2007 17:49:15 -0400
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <46A67278.6000709@v.loewis.de>
References: <46A65943.7000302@v.loewis.de>
	<682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com>
	<46A67278.6000709@v.loewis.de>
Message-ID: <304831B1-4E0E-4E54-9DD4-0CD3699DCEF2@zope.com>


On Jul 24, 2007, at 5:43 PM, Martin v. L?wis wrote:
...
> FWIW, "distribution" is quite misleading. SuSE is a Distribution
> (of Linux), and so is Debian; ActivePython is a distribution of
> Python (I think - before coming to that area, I would have though
> that "the distribution of Python varies across continents").

I didn't come up with the name "distribution". Distutils did that.   
Whether we like it or not, the Python Library Reference defines this  
term.

   http://docs.python.org/dist/distutils-term.html

We have a real problem with terminology.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From kantrn at rpi.edu  Tue Jul 24 23:47:43 2007
From: kantrn at rpi.edu (Noah Kantrowitz)
Date: Tue, 24 Jul 2007 17:47:43 -0400
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <46A6709C.1040004@v.loewis.de>
References: <46A65943.7000302@v.loewis.de>
	<20070724201911.1210E3A40A7@sparrow.telecommunity.com>
	<46A6709C.1040004@v.loewis.de>
Message-ID: <AC6D9C6E-EFD7-466B-A074-107E5E3D2B14@rpi.edu>

On Jul 24, 2007, at 5:35 PM, Martin v. L?wis wrote:

>>> After some discussion, it seems that nobody really likes
>>> the name "cheeseshop" for the Python Package Index,
>>> and some people seem to actively hate it.
>>
>> I was under the impression that that's also the case for the name
>> "PyPI", which was changed because of difficulty of disambiguating  
>> from
>> "PyPy" in conversation.
>
> That may be the case - however, Guido van Rossum said he would like
> to see PyPI promoted, and thought that this already had been decided.
> Richard Jones doesn't object; so PyPI it is.
>
>> Cheeseshop is at least a word that is obviously a noun, and it is in
>> somewhat more common use, with 224000 google hits for "cheeseshop  
>> python
>> -monty", versus 199,000 for "pypi python -monty".
>
> Sure. I can see all the reasons why one would like to have something
> like that. However, it's an authority decision, and I firmly believe
> in authority when it comes to naming things - somebody has to pick
> a name, and PyPI is the name that got picked (along with its full
> spelling of "Python Package Index" - google for that also)
>
> But then, I can't even see why the number of hits is important - what
> matters is what comes out at place 1 in Google.
>

Personally I don't have much of a problem with PyPI (pie pee eye) vs.  
PyPy (pie pie).

--Noah

From martin at v.loewis.de  Wed Jul 25 00:07:17 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 25 Jul 2007 00:07:17 +0200
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <304831B1-4E0E-4E54-9DD4-0CD3699DCEF2@zope.com>
References: <46A65943.7000302@v.loewis.de>
	<682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com>
	<46A67278.6000709@v.loewis.de>
	<304831B1-4E0E-4E54-9DD4-0CD3699DCEF2@zope.com>
Message-ID: <46A67815.7020807@v.loewis.de>

> I didn't come up with the name "distribution". Distutils did that. 
> Whether we like it or not, the Python Library Reference defines this term.
> 
>   http://docs.python.org/dist/distutils-term.html
> 
> We have a real problem with terminology.

Perhaps. I notice that the page you refer to does *not* define
the term "distribution", but "module distribution". I also notice
that PyPI is not an index for these (i.e. .tar.gz or whatever
files containing Python modules). Instead, in *indexes* Python
projects (as Richard calls them, and I think quite correctly
so). Each project then may have multiple _releases_, and each
of them may refer to distributions (but not only so, it
also refers to a home page, an author, a description, Trove
classifiers, etc).

Regards,
Martin

FWIW, the distutils terminology would make it "PyMDI" :-)

From pje at telecommunity.com  Wed Jul 25 00:13:48 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jul 2007 18:13:48 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
 index.
In-Reply-To: <e91cc0270707241337r3ee5a9ccwb73286976007b3a7@mail.gmail.co
 m>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<20070724175013.238323A40A7@sparrow.telecommunity.com>
	<46A64317.6090907@v.loewis.de>
	<20070724184151.1EAE53A40A7@sparrow.telecommunity.com>
	<46A64AE0.7000307@v.loewis.de>
	<20070724194426.828EE3A40A7@sparrow.telecommunity.com>
	<e91cc0270707241257g4d65e2fblfcbde19983adfa53@mail.gmail.com>
	<46A65AD3.3060607@v.loewis.de>
	<e91cc0270707241307g482cbe9fn96e170820955a731@mail.gmail.com>
	<46A65FDE.7080806@v.loewis.de>
	<e91cc0270707241337r3ee5a9ccwb73286976007b3a7@mail.gmail.com>
Message-ID: <20070724221128.D9B193A40A7@sparrow.telecommunity.com>

At 04:37 PM 7/24/2007 -0400, Noah Gift wrote:
>The basic algorithm is that a local index of PyPi could be kept in one
>file.  If an incorrect search was made, the first action to occur
>would be to check if the local file was the same as the file on the
>server.  If not, it would sync the changes with svn.  Then
>easy_install would try to do lookups against the local file to find a
>match.

Note that there are a lot of ways you can implement something like 
this without even involving me on the client or Martin on the 
server.  For example, setuptools.package_index uses urllib2 for all 
its URL access, so installing an "opener" that does caching before 
invoking easy_install is possible.  You can also subclass the 
easy_install command class and the PackageIndex class, or tell the 
easy_install command class to use a different PackageIndex implementation.

In the long run, I'd like to add some entry points to allow people to 
extend the search mechanism in such ways, but for now you can 
certainly hack subclasses easily enough and make your own alternative 
commands, as Jim has done for integrating zc.buildout with setuptools.


From richardjones at optushome.com.au  Wed Jul 25 00:14:22 2007
From: richardjones at optushome.com.au (Richard Jones)
Date: Wed, 25 Jul 2007 08:14:22 +1000
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
	distribution names (second try)
In-Reply-To: <46A640ED.8070406@ibp.de>
References: <ACAAE4EF-1EA0-401D-9F27-E652D98A971B@zope.com>
	<AB302A9F-A705-48E4-B6B1-2CCEEC6CADE1@zope.com>
	<46A640ED.8070406@ibp.de>
Message-ID: <200707250814.22196.richardjones@optushome.com.au>

On Wed, 25 Jul 2007, Lars Immisch wrote:
> Still, I'd like a stated convention how version numbers are compared. I
> believe this would be good for setuptools also.

Currently PyPI sorts releases using distutils.version.LooseVersion

It uses distutils.version.StrictVersion when parsing "provides, "requires" 
and "obsoletes" setup.py package meta-data.


    Richard

From g.brandl at gmx.net  Wed Jul 25 00:15:22 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 25 Jul 2007 00:15:22 +0200
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <46A67815.7020807@v.loewis.de>
References: <46A65943.7000302@v.loewis.de>	<682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com>	<46A67278.6000709@v.loewis.de>	<304831B1-4E0E-4E54-9DD4-0CD3699DCEF2@zope.com>
	<46A67815.7020807@v.loewis.de>
Message-ID: <f85tlo$pdo$1@sea.gmane.org>

Martin v. L?wis schrieb:
>> I didn't come up with the name "distribution". Distutils did that. 
>> Whether we like it or not, the Python Library Reference defines this term.
>> 
>>   http://docs.python.org/dist/distutils-term.html
>> 
>> We have a real problem with terminology.
> 
> Perhaps. I notice that the page you refer to does *not* define
> the term "distribution", but "module distribution". I also notice
> that PyPI is not an index for these (i.e. .tar.gz or whatever
> files containing Python modules). Instead, in *indexes* Python
> projects (as Richard calls them, and I think quite correctly
> so). Each project then may have multiple _releases_, and each
> of them may refer to distributions (but not only so, it
> also refers to a home page, an author, a description, Trove
> classifiers, etc).

So why not change the name to "Python Project Index"? The abbreviation
stays the same...

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From benji at benjiyork.com  Tue Jul 24 22:47:05 2007
From: benji at benjiyork.com (Benji York)
Date: Tue, 24 Jul 2007 16:47:05 -0400
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com>
References: <46A65943.7000302@v.loewis.de>
	<682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com>
Message-ID: <46A66549.4010406@benjiyork.com>

Jim Fulton wrote:
> A more precise name would be pydi

/me looks forward to the Mr. T jokes.
-- 
Benji York
http://benjiyork.com

From noah.gift at gmail.com  Wed Jul 25 00:30:31 2007
From: noah.gift at gmail.com (Noah Gift)
Date: Tue, 24 Jul 2007 18:30:31 -0400
Subject: [Catalog-sig] [Distutils] Prototype setuptools-specific PyPI
	index.
In-Reply-To: <20070724221128.D9B193A40A7@sparrow.telecommunity.com>
References: <6C905281-8BF2-4FBA-B7E7-8F2F3BEE5EE1@zope.com>
	<20070724184151.1EAE53A40A7@sparrow.telecommunity.com>
	<46A64AE0.7000307@v.loewis.de>
	<20070724194426.828EE3A40A7@sparrow.telecommunity.com>
	<e91cc0270707241257g4d65e2fblfcbde19983adfa53@mail.gmail.com>
	<46A65AD3.3060607@v.loewis.de>
	<e91cc0270707241307g482cbe9fn96e170820955a731@mail.gmail.com>
	<46A65FDE.7080806@v.loewis.de>
	<e91cc0270707241337r3ee5a9ccwb73286976007b3a7@mail.gmail.com>
	<20070724221128.D9B193A40A7@sparrow.telecommunity.com>
Message-ID: <e91cc0270707241530u60b992abwdc38333e006804ed@mail.gmail.com>

On 7/24/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 04:37 PM 7/24/2007 -0400, Noah Gift wrote:
> >The basic algorithm is that a local index of PyPi could be kept in one
> >file.  If an incorrect search was made, the first action to occur
> >would be to check if the local file was the same as the file on the
> >server.  If not, it would sync the changes with svn.  Then
> >easy_install would try to do lookups against the local file to find a
> >match.
>
> Note that there are a lot of ways you can implement something like
> this without even involving me on the client or Martin on the
> server.  For example, setuptools.package_index uses urllib2 for all
> its URL access, so installing an "opener" that does caching before
> invoking easy_install is possible.  You can also subclass the
> easy_install command class and the PackageIndex class, or tell the
> easy_install command class to use a different PackageIndex implementation.
>
> In the long run, I'd like to add some entry points to allow people to
> extend the search mechanism in such ways, but for now you can
> certainly hack subclasses easily enough and make your own alternative
> commands, as Jim has done for integrating zc.buildout with setuptools.
>

Great suggestion!  I really like that idea.

Does this mean it is also easy to point to another local repository
that is available via NFS?  I guess a local http mirror would work
just as well, if you told the opener about it.

This seems like a good way to instruct a sysadmin on how to setup a
local customized infrastructure!


>


-- 
http://www.blog.noahgift.com

From jim at zope.com  Wed Jul 25 00:31:29 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 24 Jul 2007 18:31:29 -0400
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <46A67815.7020807@v.loewis.de>
References: <46A65943.7000302@v.loewis.de>
	<682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com>
	<46A67278.6000709@v.loewis.de>
	<304831B1-4E0E-4E54-9DD4-0CD3699DCEF2@zope.com>
	<46A67815.7020807@v.loewis.de>
Message-ID: <2F7239A7-4098-4E0F-BA06-93042CD479C4@zope.com>


On Jul 24, 2007, at 6:07 PM, Martin v. L?wis wrote:

>> I didn't come up with the name "distribution". Distutils did that.
>> Whether we like it or not, the Python Library Reference defines  
>> this term.
>>
>>   http://docs.python.org/dist/distutils-term.html
>>
>> We have a real problem with terminology.
>
> Perhaps. I notice that the page you refer to does *not* define
> the term "distribution", but "module distribution".

Obviously, "module" is modifying the term "distribution". No matter,  
I like your point below.

> I also notice
> that PyPI is not an index for these (i.e. .tar.gz or whatever
> files containing Python modules). Instead, in *indexes* Python
> projects (as Richard calls them, and I think quite correctly
> so). Each project then may have multiple _releases_, and each
> of them may refer to distributions (but not only so, it
> also refers to a home page, an author, a description, Trove
> classifiers, etc).

I think that using the term "project" here addresses the terminology  
issue nicely.  As Phillip pointed out, this is the terminology that  
setuptools uses.  So maybe PyPI should expand to "Python Project Index".

Aside from PyPI, I'd really like to "bless" this terminology.  If we  
all seem to like this term, I'd be happy to try to update the  
distutils documentation to reflect this terminology. (I hope we don't  
need a PEP to adopt this terminology.)

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From renesd at gmail.com  Wed Jul 25 01:34:27 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Wed, 25 Jul 2007 09:34:27 +1000
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <46A65943.7000302@v.loewis.de>
References: <46A65943.7000302@v.loewis.de>
Message-ID: <64ddb72c0707241634m7ff701f4y1f6e97db1b0c49be@mail.gmail.com>

Cheeseshop is better I reckon.

pypi sounds like nothing.  Cheeseshop is at least fun... for those
with a sense of humour.


On 7/25/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> After some discussion, it seems that nobody really likes
> the name "cheeseshop" for the Python Package Index,
> and some people seem to actively hate it.
>
> So I'm going to change the name (again/back): the software
> will call itself "Python Package Index", abbreviated as
> pypi (PyPI where case matters). The machine address
> cheeseshop.python.org will continue to work for a
> foreseeable future, but will not be actively advertised.
>
> Regards,
> Martin
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>

From waterbug at pangalactic.us  Wed Jul 25 03:56:17 2007
From: waterbug at pangalactic.us (Stephen Waterbury)
Date: Tue, 24 Jul 2007 21:56:17 -0400
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
 distribution names
In-Reply-To: <20070724174248.F40AA3A40A7@sparrow.telecommunity.com>
References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>
	<46A62DA7.9000304@pangalactic.us>
	<DD07AC04-3A41-4799-98AC-73F92B3FE50D@zope.com>
	<46A63240.7070003@pangalactic.us>
	<20070724174248.F40AA3A40A7@sparrow.telecommunity.com>
Message-ID: <46A6ADC1.4000008@pangalactic.us>

Phillip J. Eby wrote:
> At 01:09 PM 7/24/2007 -0400, Stephen Waterbury wrote:
>> Actually, I wasn't confused.  :)  I'd suggest a convention that allows
>> a distribution "title" (e.g., "Zope", "Twisted", etc.) and a
>> distribution "name" that would simply be the name of the
>> distribution's top-level package (e.g., "zope", "twisted", etc.),
> 
> This proposal would rule out namespace packages ...

I thought about that.  The rule for namespace distributions would be to
allow dotted names, e.g. "zope.interface", "zope.schema", etc., as are
often currently used.  In fact, in a real sense, those *are* the
top-level packages of namespace packages.

> in addition to being 
> incompatible with existing distribution names.

I thought the point was to come up with a new distribution naming
convention, because there currently isn't one -- but the naming
convention has to be consistent with all existing distribution
names?  Seems a tough constraint.

> Note that package != distribution ...

Yes, I knew that.  Of course, now the discussion seems to suggest
"project" or "project release" might be a better name than
"distribution", and I agree with that.

> -- a distribution may contain zero or 
> more packages (even top-level) ...

Indeed, and I've always disliked multiple top-level packages in an
[installable unit].  I never liked ZODB strewing top-level packages
all over site-packages.  (But I do like ZODB -- thanks Jim et al.!  I'd
just much prefer that it have a top-level "zodb" package.)  Of course,
eggs make site-packages dirs look much tidier, but I'd still prefer
that each [installable unit] have a top-level package, because then
it's obvious where imported modules come from just by looking at
their top-level namespace.

> *and* a single package (top-level or 
> otherwise) may be spread over more than one distribution.

IMO, a package that's spread over more than one distribution should
probably not be top-level in both distributions.  :)

BTW, I am not emotionally attached to this proposal (good thing, eh? ;),
but there are a couple of principles in it that I thought deserved a
little bit of logical advocacy, e.g.:

* if a package deserves a "top-level" namespace, it probably also
deserves have its own [installable unit].

* although package != [installable unit], I still think it's
not illogical to use the top-level package of an [installable unit] as
part of its canonical unique identifier.  But admittedly one would have
to agree with some of my other points above to agree with that.

Steve


From pje at telecommunity.com  Wed Jul 25 04:19:03 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 24 Jul 2007 22:19:03 -0400
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
 distribution names
In-Reply-To: <46A6ADC1.4000008@pangalactic.us>
References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>
	<46A62DA7.9000304@pangalactic.us>
	<DD07AC04-3A41-4799-98AC-73F92B3FE50D@zope.com>
	<46A63240.7070003@pangalactic.us>
	<20070724174248.F40AA3A40A7@sparrow.telecommunity.com>
	<46A6ADC1.4000008@pangalactic.us>
Message-ID: <20070725021710.B45F43A40A7@sparrow.telecommunity.com>

At 09:56 PM 7/24/2007 -0400, Stephen Waterbury wrote:
>I thought the point was to come up with a new distribution naming
>convention,

Nope, just clarify the rules for *distinguishing* projects by name -- 
a much less ambitious goal, since it's pretty easy to do with little 
or no impact on existing projects.

A new naming convention isn't in scope, since it would require a 
"boil the ocean" renaming effort to implement, assuming you could get 
everyone to agree in the first place.


From waterbug at pangalactic.us  Wed Jul 25 04:56:03 2007
From: waterbug at pangalactic.us (Stephen Waterbury)
Date: Tue, 24 Jul 2007 22:56:03 -0400
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
 distribution names
In-Reply-To: <20070725021710.B45F43A40A7@sparrow.telecommunity.com>
References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>	<46A62DA7.9000304@pangalactic.us>	<DD07AC04-3A41-4799-98AC-73F92B3FE50D@zope.com>	<46A63240.7070003@pangalactic.us>	<20070724174248.F40AA3A40A7@sparrow.telecommunity.com>	<46A6ADC1.4000008@pangalactic.us>
	<20070725021710.B45F43A40A7@sparrow.telecommunity.com>
Message-ID: <46A6BBC3.7000505@pangalactic.us>

Phillip J. Eby wrote:
> At 09:56 PM 7/24/2007 -0400, Stephen Waterbury wrote:
>> I thought the point was to come up with a new distribution naming
>> convention,
> 
> Nope, just clarify the rules for *distinguishing* projects by name -- 
> a much less ambitious goal, since it's pretty easy to do with little 
> or no impact on existing projects.
> 
> A new naming convention isn't in scope, since it would require a 
> "boil the ocean" renaming effort to implement, assuming you could get 
> everyone to agree in the first place.

Indeed.  Boiling the ocean will have to wait.  I still think putting
multiple top-level packages in a single installable is a mistake.  ;)

Peace.
Steve

From martin at v.loewis.de  Wed Jul 25 07:34:01 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 25 Jul 2007 07:34:01 +0200
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <f85tlo$pdo$1@sea.gmane.org>
References: <46A65943.7000302@v.loewis.de>	<682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com>	<46A67278.6000709@v.loewis.de>	<304831B1-4E0E-4E54-9DD4-0CD3699DCEF2@zope.com>	<46A67815.7020807@v.loewis.de>
	<f85tlo$pdo$1@sea.gmane.org>
Message-ID: <46A6E0C9.1050802@v.loewis.de>

> So why not change the name to "Python Project Index"? The abbreviation
> stays the same...

Because I'm uncertain how people will react to it. The unabbreviated
name really doesn't matter that much, except for Google searches
perhaps. If "Project Index" gets a clear preference over "Package
Index", we can try that; if people are likely to object to it also,
nothing is gained.

Regards,
Martin

From benji at benjiyork.com  Wed Jul 25 14:40:28 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 25 Jul 2007 08:40:28 -0400
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <46A6E0C9.1050802@v.loewis.de>
References: <46A65943.7000302@v.loewis.de>	<682F74FB-071E-4A82-8A90-E1ECC4A99E77@zope.com>	<46A67278.6000709@v.loewis.de>	<304831B1-4E0E-4E54-9DD4-0CD3699DCEF2@zope.com>	<46A67815.7020807@v.loewis.de>	<f85tlo$pdo$1@sea.gmane.org>
	<46A6E0C9.1050802@v.loewis.de>
Message-ID: <46A744BC.5010702@benjiyork.com>

Martin v. L?wis wrote:
>> So why not change the name to "Python Project Index"? The
>> abbreviation stays the same...
> 
> Because I'm uncertain how people will react to it. [...] If "Project
> Index" gets a clear preference over "Package Index", we can try that;
> if people are likely to object to it also, nothing is gained.

This looks like a bike shed from here.  I suggest you and whomever you 
want to consult pick a name and go with it.  And I repent for my earlier 
paint color suggestion. <wink>
-- 
Benji York
http://benjiyork.com

From jim at zope.com  Wed Jul 25 15:25:38 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 25 Jul 2007 09:25:38 -0400
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
	distribution names
In-Reply-To: <20070725021710.B45F43A40A7@sparrow.telecommunity.com>
References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>
	<46A62DA7.9000304@pangalactic.us>
	<DD07AC04-3A41-4799-98AC-73F92B3FE50D@zope.com>
	<46A63240.7070003@pangalactic.us>
	<20070724174248.F40AA3A40A7@sparrow.telecommunity.com>
	<46A6ADC1.4000008@pangalactic.us>
	<20070725021710.B45F43A40A7@sparrow.telecommunity.com>
Message-ID: <DAAC3AB3-D52B-4633-B3E2-3F36E156C95C@zope.com>


On Jul 24, 2007, at 10:19 PM, Phillip J. Eby wrote:

> At 09:56 PM 7/24/2007 -0400, Stephen Waterbury wrote:
>> I thought the point was to come up with a new distribution naming
>> convention,
>
> Nope, just clarify the rules for *distinguishing* projects by name --
> a much less ambitious goal, since it's pretty easy to do with little
> or no impact on existing projects.

I mostly agree, except that I think we also need to define what is  
legal in a project name.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Wed Jul 25 15:30:45 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 25 Jul 2007 09:30:45 -0400
Subject: [Catalog-sig] [Distutils] We need to make a decision wrt
	distribution names
In-Reply-To: <46A6ADC1.4000008@pangalactic.us>
References: <7AC9ED0E-FFAF-4493-9EBF-068538F2ABA9@zope.com>
	<46A62DA7.9000304@pangalactic.us>
	<DD07AC04-3A41-4799-98AC-73F92B3FE50D@zope.com>
	<46A63240.7070003@pangalactic.us>
	<20070724174248.F40AA3A40A7@sparrow.telecommunity.com>
	<46A6ADC1.4000008@pangalactic.us>
Message-ID: <1882F28C-AC83-4EEE-9F86-979CF5DEB88E@zope.com>


On Jul 24, 2007, at 9:56 PM, Stephen Waterbury wrote:

> Phillip J. Eby wrote:
>> At 01:09 PM 7/24/2007 -0400, Stephen Waterbury wrote:
>>> Actually, I wasn't confused.  :)  I'd suggest a convention that  
>>> allows
>>> a distribution "title" (e.g., "Zope", "Twisted", etc.) and a
>>> distribution "name" that would simply be the name of the
>>> distribution's top-level package (e.g., "zope", "twisted", etc.),
>>
>> This proposal would rule out namespace packages ...
>
> I thought about that.  The rule for namespace distributions would  
> be to
> allow dotted names, e.g. "zope.interface", "zope.schema", etc., as are
> often currently used.  In fact, in a real sense, those *are* the
> top-level packages of namespace packages.

Those are the top-level packages of those distributions.

>> in addition to being
>> incompatible with existing distribution names.
>
> I thought the point was to come up with a new distribution naming
> convention, because there currently isn't one -- but the naming
> convention has to be consistent with all existing distribution
> names?  Seems a tough constraint.

No, my proposal was to define:

- Rules for constructing *legal* (as opposed to "good") project names

- Rules for variations on project names.

...

>> -- a distribution may contain zero or 
>> more packages (even top-level) ...
>
> Indeed, and I've always disliked multiple top-level packages in an
> [installable unit].

No offense intended, but this seems arbitrary to me.  Note that not  
only can a distribution contain more than one package, it can contain  
no packages.

>> *and* a single package (top-level or
>> otherwise) may be spread over more than one distribution.
>
> IMO, a package that's spread over more than one distribution should
> probably not be top-level in both distributions.  :)

Phillip was (I think) referring to namespace packages.  Namespace  
packages are a very important tool for maintaining some sanity in  
package naming.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From jim at zope.com  Wed Jul 25 15:59:31 2007
From: jim at zope.com (Jim Fulton)
Date: Wed, 25 Jul 2007 09:59:31 -0400
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <64ddb72c0707241634m7ff701f4y1f6e97db1b0c49be@mail.gmail.com>
References: <46A65943.7000302@v.loewis.de>
	<64ddb72c0707241634m7ff701f4y1f6e97db1b0c49be@mail.gmail.com>
Message-ID: <F6A1C97A-259B-47D6-A494-4E07CBBAF1AB@zope.com>


On Jul 24, 2007, at 7:34 PM, Ren? Dudfield wrote:

> Cheeseshop is better I reckon.
>
> pypi sounds like nothing.  Cheeseshop is at least fun... for those
> with a sense of humour.

Minor hysterical note: The original joke, as I understand it, was  
based on the fact that PyPI originally didn't contain any packages  
(or distributions). Like the cheeseshop having no cheese, the package  
index had no packages (or even distributions).  The original joke  
doesn't really apply any more as many or most of us are actually  
uploading our distributions, so the project index really does indeed  
have cheese, I mean distributions.

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




From renesd at gmail.com  Thu Jul 26 00:50:56 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Thu, 26 Jul 2007 08:50:56 +1000
Subject: [Catalog-sig] Changing cheeseshop.python.org to pypi.python.org
In-Reply-To: <F6A1C97A-259B-47D6-A494-4E07CBBAF1AB@zope.com>
References: <46A65943.7000302@v.loewis.de>
	<64ddb72c0707241634m7ff701f4y1f6e97db1b0c49be@mail.gmail.com>
	<F6A1C97A-259B-47D6-A494-4E07CBBAF1AB@zope.com>
Message-ID: <64ddb72c0707251550y9a5603ay38067556fbcb1584@mail.gmail.com>

ok, that is even funnier - or are all jokes ruined by being explained?


btw, I don't think most people are uploading their packages yet.  You just
have to compare the number of python projects on A) sourceforge/googlecode,
B) python cookbook C) pygame projects
There's still massive amounts of python code not indexed by pypi.

oh, was projects one of the names being considered?  projects.python.org ?

How does the cookbook idea fit in with pypi?  I guess I've seen pypi as
about packages and modules rather than projects and cookbook recipes.  Maybe
those things were left out on purpose?


On 7/25/07, Jim Fulton <jim at zope.com> wrote:
>
>
> On Jul 24, 2007, at 7:34 PM, Ren? Dudfield wrote:
>
> > Cheeseshop is better I reckon.
> >
> > pypi sounds like nothing.  Cheeseshop is at least fun... for those
> > with a sense of humour.
>
> Minor hysterical note: The original joke, as I understand it, was
> based on the fact that PyPI originally didn't contain any packages
> (or distributions). Like the cheeseshop having no cheese, the package
> index had no packages (or even distributions).  The original joke
> doesn't really apply any more as many or most of us are actually
> uploading our distributions, so the project index really does indeed
> have cheese, I mean distributions.
>
> Jim
>
> --
> Jim Fulton                      mailto:jim at zope.com             Python
> Powered!
> CTO                             (540) 361-1714
> http://www.python.org
> Zope Corporation        http://www.zope.com
> http://www.zope.org
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/catalog-sig/attachments/20070726/fd729fa5/attachment.html 

From michael at d2m.at  Fri Jul 27 07:24:15 2007
From: michael at d2m.at (Michael Haubenwallner)
Date: Fri, 27 Jul 2007 07:24:15 +0200
Subject: [Catalog-sig] rss feed: broken links
Message-ID: <f8bvi2$8ja$1@sea.gmane.org>

FYI: the RSS feeds links to packages
(like http://python.python.org/pypi/...)
are broken.
Btw, changing the URLs will likely double the last 30 feeditems in 
aggregators.

Michael

-- 
http://www.zope.org/Members/d2m
http:/planetzope.org


From martin at v.loewis.de  Fri Jul 27 08:03:53 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 27 Jul 2007 08:03:53 +0200
Subject: [Catalog-sig] rss feed: broken links
In-Reply-To: <f8bvi2$8ja$1@sea.gmane.org>
References: <f8bvi2$8ja$1@sea.gmane.org>
Message-ID: <46A98AC9.3050009@v.loewis.de>

Michael Haubenwallner schrieb:
> FYI: the RSS feeds links to packages
> (like http://python.python.org/pypi/...)
> are broken.

Oops, fixed.

> Btw, changing the URLs will likely double the last 30 feeditems in 
> aggregators.

Sure. I don't think anything can be done about that; after a few
days, this is old news, anyway.

Regards,
Martin

From renesd at gmail.com  Sat Jul 28 02:22:16 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sat, 28 Jul 2007 10:22:16 +1000
Subject: [Catalog-sig] static files, and testing pypi
Message-ID: <64ddb72c0707271722w3da8dfa2x4668f097df6a2c9b@mail.gmail.com>

Hello,

I've got a bit of spare time again after catching up on work after
attending europython - so was wondering if I should still finish the
static file stuff?

Should I still finish the static file generation, or is this not wanted?

I think it would still be useful.  So if it is still wanted, can I
please get a new version of the database?  Since I think there has
been significant changes since my copy of it.


As part of it I want to write some unittests and regression
tests/monitoring scripts.

I think this is sorely needed for pypi, so we don't see the same kind
of breakage when we refactor - and to make sure the service is running
ok.

I guess a tool for this stuff might be the webunit that Richard Jones
wrote?  Or some other tool?  http://mechanicalcat.net/tech/webunit/

Should unittests just be written with unittest?  Or some other framework?


If the maintainers want to stick with no tests I'll just write my
tests separately.  Or I can just set up a basic framework with
unittest, and webunit.

From martin at v.loewis.de  Sat Jul 28 09:34:26 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 28 Jul 2007 09:34:26 +0200
Subject: [Catalog-sig] static files, and testing pypi
In-Reply-To: <64ddb72c0707271722w3da8dfa2x4668f097df6a2c9b@mail.gmail.com>
References: <64ddb72c0707271722w3da8dfa2x4668f097df6a2c9b@mail.gmail.com>
Message-ID: <46AAF182.9070400@v.loewis.de>

> I think it would still be useful.  So if it is still wanted, can I
> please get a new version of the database?  Since I think there has
> been significant changes since my copy of it.

Please take a look at the tools/sql-migrate* files. They should bring
your database up to the current schema.

> I guess a tool for this stuff might be the webunit that Richard Jones
> wrote?  Or some other tool?  http://mechanicalcat.net/tech/webunit/
> 
> Should unittests just be written with unittest?  Or some other framework?

I don't care what framework is chosen - pick any that allows for
completely automated test runs.

Regards,
Martin

From richardjones at optushome.com.au  Sat Jul 28 10:03:04 2007
From: richardjones at optushome.com.au (Richard Jones)
Date: Sat, 28 Jul 2007 18:03:04 +1000
Subject: [Catalog-sig] static files, and testing pypi
In-Reply-To: <46AAF182.9070400@v.loewis.de>
References: <64ddb72c0707271722w3da8dfa2x4668f097df6a2c9b@mail.gmail.com>
	<46AAF182.9070400@v.loewis.de>
Message-ID: <200707281803.04466.richardjones@optushome.com.au>

On Sat, 28 Jul 2007, Martin v. L?wis wrote:
> > I guess a tool for this stuff might be the webunit that Richard Jones
> > wrote?  Or some other tool?  http://mechanicalcat.net/tech/webunit/
> >
> > Should unittests just be written with unittest?  Or some other framework?
>
> I don't care what framework is chosen - pick any that allows for
> completely automated test runs.

FWIW that's pretty much what webunit was designed for.


    Richard

From benji at benjiyork.com  Sat Jul 28 15:02:44 2007
From: benji at benjiyork.com (Benji York)
Date: Sat, 28 Jul 2007 09:02:44 -0400
Subject: [Catalog-sig] static files, and testing pypi
In-Reply-To: <64ddb72c0707271722w3da8dfa2x4668f097df6a2c9b@mail.gmail.com>
References: <64ddb72c0707271722w3da8dfa2x4668f097df6a2c9b@mail.gmail.com>
Message-ID: <46AB3E74.6090303@benjiyork.com>

Ren? Dudfield wrote:
> Should I still finish the static file generation, or is this not wanted?

I like the idea, if only from a stability standpoint. (Granted, 
stability has been improved greatly of late, but static files will 
always trump dynamic page generation).

> As part of it I want to write some unittests and regression
> tests/monitoring scripts.

+1

> I guess a tool for this stuff might be the webunit that Richard Jones
> wrote?  Or some other tool?  http://mechanicalcat.net/tech/webunit/

There's a good list of web testing tools at 
http://pycheesecake.org/wiki/PythonTestingToolsTaxonomy#WebTestingTools. 
  I have a fondness for doctest, so would recommend that as well.
-- 
Benji York
http://benjiyork.com

From martin at v.loewis.de  Sat Jul 28 15:37:57 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 28 Jul 2007 15:37:57 +0200
Subject: [Catalog-sig] static files, and testing pypi
In-Reply-To: <46AB3E74.6090303@benjiyork.com>
References: <64ddb72c0707271722w3da8dfa2x4668f097df6a2c9b@mail.gmail.com>
	<46AB3E74.6090303@benjiyork.com>
Message-ID: <46AB46B5.6020806@v.loewis.de>

> I like the idea, if only from a stability standpoint. (Granted, 
> stability has been improved greatly of late, but static files will 
> always trump dynamic page generation).

Depends on how you define stability, perhaps. If the dynamic generation
on update stops working at some point, such an error may remain
unnoticed for some period of time. The pages remain available, but
are incorrect.

Regards,
Martin

From martin at v.loewis.de  Sat Jul 28 16:22:29 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 28 Jul 2007 16:22:29 +0200
Subject: [Catalog-sig] setuptools upload to pypi
In-Reply-To: <20070723212920.8BFE63A40AA@sparrow.telecommunity.com>
References: <C9BBF98B-C127-40AD-8888-AF4E60EE9D64@lovelysystems.com>
	<46A50BF1.9020303@v.loewis.de>
	<20070723204445.65ABC3A40AA@sparrow.telecommunity.com>
	<46A51798.8000907@v.loewis.de>
	<20070723212920.8BFE63A40AA@sparrow.telecommunity.com>
Message-ID: <46AB5125.2000806@v.loewis.de>

> RewriteEngine On
> RewriteBase /
> RewriteCond %{REQUEST_METHOD} ^GET$
> RewriteRule ^pypi(.*)$
> http://cheeseshop.python.org/pypi$1?%{QUERY_STRING} [R,L]
> RewriteRule ^pypi(.*)$
> http://cheeseshop.python.org/pypi$1?%{QUERY_STRING} [P,L]

Thanks! I have now activated something like this, namely

RewriteCond %{REQUEST_METHOD} ^GET$
RewriteRule ^/pypi(.*)$ http://pypi.python.org/pypi$1?%{QUERY_STRING} [R,L]
RewriteRule ^/pypi(.*)$ http://pypi.python.org/pypi$1?%{QUERY_STRING} [P,L]

I haven't set RewriteBase, as this is in the central server config
and would affect other rules as well.

Regards,
Martin