From chris at simplistix.co.uk  Tue Jun  1 12:14:29 2010
From: chris at simplistix.co.uk (Chris Withers)
Date: Tue, 01 Jun 2010 11:14:29 +0100
Subject: [Catalog-sig] PyPI down?
Message-ID: <4C04DD85.8050501@simplistix.co.uk>

Hi All,

PyPI appears to not be responding.

Anyone know why that is and when normal service might be resumed?

cheers,

Chris


From simon at ikanobori.jp  Tue Jun  1 13:20:41 2010
From: simon at ikanobori.jp (Simon de Vlieger)
Date: Tue, 1 Jun 2010 13:20:41 +0200
Subject: [Catalog-sig] PyPI down?
In-Reply-To: <4C04DD85.8050501@simplistix.co.uk>
References: <4C04DD85.8050501@simplistix.co.uk>
Message-ID: <36F7B98F-B26D-4451-8A06-BA5E86884E2E@ikanobori.jp>

Chris,

PyPI seemed to be unresponsive earlier in the day, but it currently
looks like normal service has resumed.

Regards, Simon.

On 1 jun 2010, at 12:14, Chris Withers wrote:

> PyPI appears to not be responding.
>
> Anyone know why that is and when normal service might be resumed?

From chris at simplistix.co.uk  Tue Jun  1 13:41:36 2010
From: chris at simplistix.co.uk (Chris Withers)
Date: Tue, 01 Jun 2010 12:41:36 +0100
Subject: [Catalog-sig] PyPI down?
In-Reply-To: <36F7B98F-B26D-4451-8A06-BA5E86884E2E@ikanobori.jp>
References: <4C04DD85.8050501@simplistix.co.uk>
	<36F7B98F-B26D-4451-8A06-BA5E86884E2E@ikanobori.jp>
Message-ID: <4C04F1F0.2020309@simplistix.co.uk>

Simon de Vlieger wrote:
> PyPI seemed to be unresponsive earlier in the day, but it currently
> looks like normal service has resumed.

Indeed, it would be good to know what was done to resolve it and by whom ;-)

Chris

-- 
Simplistix - Content Management, Batch Processing & Python Consulting
             - http://www.simplistix.co.uk

From jannis at leidel.info  Tue Jun  1 13:08:17 2010
From: jannis at leidel.info (Jannis Leidel)
Date: Tue, 1 Jun 2010 13:08:17 +0200
Subject: [Catalog-sig] PyPI down?
In-Reply-To: <4C04DD85.8050501@simplistix.co.uk>
References: <4C04DD85.8050501@simplistix.co.uk>
Message-ID: <CCE5A3F5-BDA6-4784-9EB9-8CEE8DCD45FF@leidel.info>

On 01.06.2010 at 12:14, Chris Withers wrote:

> Hi All,
> 
> PyPI appears to not be responding.
> 
> Anyone know why that is and when normal service might be resumed?

It would certainly be nice to know that, indeed.

What is the official policy with regard to maintenance and failover? I know there is a bigger issue (PEP 381), but if it's just a matter of manpower to restart the server or kick Apache once in a while, I'd happily volunteer to help out.

Jannis

From martin at v.loewis.de  Tue Jun  1 22:59:46 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 01 Jun 2010 22:59:46 +0200
Subject: [Catalog-sig] PyPI down?
In-Reply-To: <4C04F1F0.2020309@simplistix.co.uk>
References: <4C04DD85.8050501@simplistix.co.uk>	<36F7B98F-B26D-4451-8A06-BA5E86884E2E@ikanobori.jp>
	<4C04F1F0.2020309@simplistix.co.uk>
Message-ID: <4C0574C2.4070804@v.loewis.de>

On 01.06.2010 at 13:41, Chris Withers wrote:
> Simon de Vlieger wrote:
>> PyPI seemed to be unresponsive earlier in the day, but it currently
>> looks like normal service has resumed.
>
> Indeed, it would be good to know what was done to resolve it and by whom
> ;-)

I restarted Apache.

Regards,
Martin

From chris at simplistix.co.uk  Wed Jun  2 09:31:48 2010
From: chris at simplistix.co.uk (Chris Withers)
Date: Wed, 02 Jun 2010 08:31:48 +0100
Subject: [Catalog-sig] PyPI down?
In-Reply-To: <4C0574C2.4070804@v.loewis.de>
References: <4C04DD85.8050501@simplistix.co.uk>	<36F7B98F-B26D-4451-8A06-BA5E86884E2E@ikanobori.jp>
	<4C04F1F0.2020309@simplistix.co.uk> <4C0574C2.4070804@v.loewis.de>
Message-ID: <4C0608E4.3000005@simplistix.co.uk>

Martin v. Löwis wrote:
>> Indeed, it would be good to know what was done to resolve it and by whom
>> ;-)
> 
> I restarted Apache.

Any idea what had brought it down?
Were there lots of worker threads? High CPU usage? Memory starvation?
Does the database that backs PyPI live on the same machine?

cheers,

Chris


From chris at simplistix.co.uk  Fri Jun 11 12:44:07 2010
From: chris at simplistix.co.uk (Chris Withers)
Date: Fri, 11 Jun 2010 11:44:07 +0100
Subject: [Catalog-sig] PyPI down again...
Message-ID: <4C121377.4000008@simplistix.co.uk>

...would be good to know what brought it down before and what has 
brought it down again.

As an interim solution, what do I need to do to get access to the box 
running PyPI so I can get in and investigate/restart Apache?

cheers,

Chris


From mal at egenix.com  Fri Jun 11 12:48:55 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 11 Jun 2010 12:48:55 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C121377.4000008@simplistix.co.uk>
References: <4C121377.4000008@simplistix.co.uk>
Message-ID: <4C121497.5040806@egenix.com>

Chris Withers wrote:
> ...would be good to know what brought it down before and what has
> brought it down again.

It works for me.

> As an interim solution, what do I need to do to get access to the box
> running PyPI so I can get in and investigate/restart Apache?

Since PyPI is a rather essential Python resource, is there some
monitoring in place to automatically notify the webmasters ?

Something like e.g. a Zenoss instance checking whether PyPI is
pingable.

If not, we'd need to address this in the PSF infrastructure committee.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 11 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                37 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From chris at simplistix.co.uk  Fri Jun 11 12:50:46 2010
From: chris at simplistix.co.uk (Chris Withers)
Date: Fri, 11 Jun 2010 11:50:46 +0100
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C121497.5040806@egenix.com>
References: <4C121377.4000008@simplistix.co.uk> <4C121497.5040806@egenix.com>
Message-ID: <4C121506.9070708@simplistix.co.uk>

M.-A. Lemburg wrote:
> Chris Withers wrote:
>> ...would be good to know what brought it down before and what has
>> brought it down again.
> 
> It works for me.

Yes, I guess someone went in and did something.
Given that the topic in #python says it's down and a couple of those 
"down for me or everyone" websites all confirmed it when I was having 
problems...

>> As an interim solution, what do I need to do to get access to the box
>> running PyPI so I can get in and investigate/restart Apache?
> 
> Since PyPI is a rather essential Python resource, is there some
> monitoring in place to automatically notify the webmasters ?

Good question...

> Something like e.g. a Zenoss instance checking whether PyPI is
> pingable.

Hmm, not enough. I suspect the box would have been pingable, it's just 
the web app that is getting wedged...
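
The distinction drawn here, between a host that answers ping and a web application that actually answers requests, can be sketched as an HTTP-level probe. This is only a minimal illustration, not the monitoring PyPI used: the function name `check_http`, its `opener` parameter, and any URL passed to it are assumptions made for the example.

```python
import urllib.error
import urllib.request


def check_http(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return (ok, detail) for an application-level check of `url`.

    Hypothetical helper: a ping only proves the host is reachable, so this
    issues a real HTTP request and requires a 2xx status within `timeout`.
    `opener` is injectable so the check can be exercised without a network.
    """
    try:
        response = opener(url, timeout=timeout)
        try:
            status = response.getcode()
        finally:
            response.close()
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 500 or 503).
        return False, "HTTP %d" % exc.code
    except OSError as exc:
        # Covers URLError, refused connections, and socket timeouts,
        # which is what a wedged web app on a pingable box looks like.
        return False, "no response: %s" % exc
    if 200 <= status < 300:
        return True, "HTTP %d" % status
    return False, "HTTP %d" % status
```

A check like this could be run periodically and wired to a notification, so that a wedged application is reported even while the machine itself still responds to ping.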

Chris


From marrakis at gmail.com  Fri Jun 11 12:50:52 2010
From: marrakis at gmail.com (Mathieu Leduc-Hamel)
Date: Fri, 11 Jun 2010 12:50:52 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C121377.4000008@simplistix.co.uk>
References: <4C121377.4000008@simplistix.co.uk>
Message-ID: <AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>

Up again...

Some monitoring and logging information would certainly be great. I'm
currently testing the validity of the code with unit tests, and after that I
would like to implement a couple of new features along those lines.

Who is responsible for the project and its maintenance?

I have started working on PyPI with Tarek Ziadé to implement the new
distutils2 metadata, and I'm completely focused on quality and things like that.

On Fri, Jun 11, 2010 at 12:44 PM, Chris Withers <chris at simplistix.co.uk> wrote:

> ...would be good to know what brought it down before and what has brought
> it down again.
>
> As an interim solution, what do I need to do to get access to the box
> running PyPI so I can get in and investigate/restart Apache?
>
> cheers,
>
> Chris

From martin at v.loewis.de  Fri Jun 11 20:17:56 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 11 Jun 2010 20:17:56 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
Message-ID: <4C127DD4.5010801@v.loewis.de>

> Who is responsible of the project and the maintenance ?

I am.

Regards,
Martin

From justin.ryan at reliefgarden.org  Fri Jun 11 22:09:30 2010
From: justin.ryan at reliefgarden.org (Justin Ryan)
Date: Fri, 11 Jun 2010 13:09:30 -0700
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C127DD4.5010801@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com> 
	<4C127DD4.5010801@v.loewis.de>
Message-ID: <AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>

Is it possible it's time to designate a team?  I'm sure everyone
appreciates the hard work of a lone volunteer, but having been one
myself at times, the feeling that others may not do the job right is
often eclipsed by their availability to try.

It seems like for months if not years, those of us relying on PyPI for
day-to-day use, esp for deployments and developer environments like
buildout, run into issues where we simply can't work for a significant
part of a day.

What's up with this years-old PEP for expanding the PyPI
infrastructure?  Are there resources, relationships, volunteers
lacking?

What can we do to help? :)

Best,

J

On Fri, Jun 11, 2010 at 11:17 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> Who is responsible of the project and the maintenance ?
>
> I am.
>

From martin at v.loewis.de  Fri Jun 11 22:56:04 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 11 Jun 2010 22:56:04 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
Message-ID: <4C12A2E4.2090305@v.loewis.de>

> Is it possible it's time to designate a team?  I'm sure everyone
> appreciates the hard work of a lone volunteer, but having been one
> myself at times, the feeling that others may not do the job right is
> often eclipsed by their availability to try.

Help is certainly appreciated. The type of help depends on the 
volunteer, of course. E.g. I wouldn't want to give root accounts to
the first person that comes along and asks for them (except when the 
first person is Jannis Leidel, who (I believe) did the Apache restart
today).

> What's up with this years-old PEP for expanding the PyPI
> infrastructure?  Are there resources, relationships, volunteers
> lacking?
>
> What can we do to help? :)

If you are willing to invest *a lot* of time, then it seems that 
rewriting PyPI in Django would make a lot of people happy, because
they claim they can't contribute to the current code base because
they don't understand that. I don't want to do such a rewrite on
my own because I *do* understand the code base (despite not having 
written it in the first place, so I think that if you really want
to contribute, you can learn how it works); it also violates Joel
Spolsky's principle of never ever doing rewrites.

It will be a lot of work because it must implement full compatibility 
with the current code, which I can promise will keep you busy. Full
compatibility is primarily defined in terms of URLs that people may
have put on the web and into Google, and URLs and API that setuptools
back to very old releases may use.

That said, I have no idea what is causing the current outages. There 
must be some secret ping of death or something that somebody discovered.

For a smaller project, start putting mirror support into setuptools or 
distribute; this would make short (several hours) outages less severe 
for the class of users that want permanent availability for downloading.
It's unlikely that the mirrors would break when the master goes down;
they just stop mirroring.
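
A client-side fallback of the kind suggested here could look roughly like the following. This is a sketch under stated assumptions, not setuptools or distribute code: `fetch_with_fallback`, the `fetch` callable, and the index URLs passed in are hypothetical names introduced only for illustration.

```python
def fetch_with_fallback(path, index_urls, fetch):
    """Try each index in order; return (base_url, payload) from the first that works.

    `index_urls` lists the master first, then any mirrors. `fetch` is any
    callable that takes a full URL and returns the payload, raising OSError
    on failure. All names here are hypothetical; no real client API is implied.
    """
    errors = []
    for base in index_urls:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return base, fetch(url)
        except OSError as exc:
            # Remember why this index failed, then move on to the next one.
            errors.append("%s: %s" % (base, exc))
    raise OSError("all indexes failed: " + "; ".join(errors))
```

With the master listed first, a short outage only costs one failed request before a mirror is tried, and the caller learns which index actually served the download.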

Regards,
Martin

From justin.ryan at reliefgarden.org  Fri Jun 11 23:05:19 2010
From: justin.ryan at reliefgarden.org (Justin Ryan)
Date: Fri, 11 Jun 2010 14:05:19 -0700
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C12A2E4.2090305@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com> 
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com> 
	<4C12A2E4.2090305@v.loewis.de>
Message-ID: <AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>

On Fri, Jun 11, 2010 at 1:56 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
...
> Help is certainly appreciated. The type of help depends on the volunteer, of
> course. E.g. I wouldn't want to give root accounts to
> the first person that comes along and asks for them (except when the first
> person is Jannis Leidel, who (I believe) did the Apache restart
> today).
>

Thanks, Jannis. :)

>> What's up with this years-old PEP for expanding the PyPI
>> infrastructure?  Are there resources, relationships, volunteers
>> lacking?
>>
>> What can we do to help? :)
>
> If you are willing to invest *a lot* of time, then it seems that rewriting
> PyPI in Django would make a lot of people happy, because
> they claim they can't contribute to the current code base because
> they don't understand that. I don't want to do such a rewrite on
> my own because I *do* understand the code base (despite not having written
> it in the first place, so I think that if you really want
> to contribute, you can learn how it works); it also violates Joel
> Spolsky's principle of never ever doing rewrites.

I'll avoid deep comments about my general feelings about Django here. ;)

What is it now, just a straight WSGI app?

> It will be a lot of work because it must implement full compatibility with
> the current code, which I can promise will keep you busy. Full
> compatibility is primarily defined in terms of URLs that people may
> have put on the web and into Google, and URLs and API that setuptools
> back to very old releases may use.

Sure..

> That said, I have no idea what is causing the current outages. There must be
> some secret ping of death or something that somebody discovered.

If you want to give me a shell that can just access ps and top for
now, read-only access to log files, I can try and put some time into
keeping an eye.

> For a smaller project, start putting mirror support into setuptools or
> distribute; this would make short (several hours) outages less severe for
> the class of users that want permanent availability for downloading.
> It's unlikely that the mirrors would break when the master goes down;
> they just stop mirroring.

That's a really great idea.  I try to use egg caches in buildout and
the -N option to not look for the newest of everything all the time,
but I think it needs a bit of work as well.  We also have support for
alternate download targets in buildout, but it seems the failure mode
when PyPI is down is weak.

So, there's definitely two sides to this, we all need to be gentler
and calmer users of PyPI, and we all need it to work more.  And anyone
putting time into restarting Apache probably wants to stop doing that.
:)

Anyone interested in helping to add mirror support to distribute?  I
suspect it is distribute / setuptools which are tied to the poor
failure mode I'm encountering with zc.buildout.

Best!

J

From mal at egenix.com  Fri Jun 11 23:06:21 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 11 Jun 2010 23:06:21 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C12A2E4.2090305@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de>
Message-ID: <4C12A54D.1070406@egenix.com>

"Martin v. Löwis" wrote:
> For a smaller project, start putting mirror support into setuptools or
> distribute; this would make short (several hours) outages less severe
> for the class of users that want permanent availability for downloading.
> It's unlikely that the mirrors would break when the master goes down;
> they just stop mirroring.

A better and cleaner strategy is to put the static PyPI information
up on Amazon CloudFront and have DNS take care of providing local
mirrors (edge servers) to setuptools et al.

Such a setup won't require any complicated mirror logic in any
of the existing client tools.

By moving the PyPI installation to Amazon AWS, we could also
get the RPC access distributed to more than just one server.

As I said before, the PSF infrastructure committee needs to get on
top of the job of getting this implemented (including funding this
development).

If someone wants to volunteer helping with the setup, please contact
the PSF at psf at python.org.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com


From ziade.tarek at gmail.com  Fri Jun 11 23:07:03 2010
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Fri, 11 Jun 2010 23:07:03 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C12A2E4.2090305@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de>
Message-ID: <AANLkTikCByKmvf9UDoQ5IKkPsPFB-NW-A1ifEhttSDPI@mail.gmail.com>

On Fri, Jun 11, 2010 at 10:56 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> Is it possible it's time to designate a team?  I'm sure everyone
>> appreciates the hard work of a lone volunteer, but having been one
>> myself at times, the feeling that others may not do the job right is
>> often eclipsed by their availability to try.
>
> Help is certainly appreciated. The type of help depends on the volunteer, of
> course. E.g. I wouldn't want to give root accounts to
> the first person that comes along and asks for them (except when the first
> person is Jannis Leidel, who (I believe) did the Apache restart
> today).
>
>> What's up with this years-old PEP for expanding the PyPI
>> infrastructure?  Are there resources, relationships, volunteers
>> lacking?
>>
>> What can we do to help? :)
>
> If you are willing to invest *a lot* of time, then it seems that rewriting
> PyPI in Django would make a lot of people happy, because
> they claim they can't contribute to the current code base because
> they don't understand that. I don't want to do such a rewrite on
> my own because I *do* understand the code base (despite not having written
> it in the first place, so I think that if you really want
> to contribute, you can learn how it works); it also violates Joel
> Spolsky's principle of never ever doing rewrites.

-1

The PyPI code is evolving. With Mathieu's help I've added PEP 345 support,
and we have more coming up.

Mathieu has also invested quite some time lately in writing functional
tests for PyPI and in splitting the huge web.py module into several
modules for clarity.

I was planning to review that work and ask you before merging it to trunk.

I think we need to distinguish here between the development of
the PyPI codebase and the sysadmin work.

Tarek

-- 
Tarek Ziadé | http://ziade.org

From ziade.tarek at gmail.com  Fri Jun 11 23:11:35 2010
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Fri, 11 Jun 2010 23:11:35 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C12A54D.1070406@egenix.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
Message-ID: <AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>

On Fri, Jun 11, 2010 at 11:06 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> "Martin v. Löwis" wrote:
>> For a smaller project, start putting mirror support into setuptools or
>> distribute; this would make short (several hours) outages less severe
>> for the class of users that want permanent availability for downloading.
>> It's unlikely that the mirrors would break when the master goes down;
>> they just stop mirroring.
>
> A better and cleaner strategy is to put the static PyPI information
> up on Amazon CloudFront and have DNS take care of providing local
> mirrors (edge servers) to setuptools et al.
>
> Such a setup won't require any complicated mirror logic in any
> of the existing client tools.
>
> By moving the PyPI installation to Amazon AWS, we could also
> get the RPC access distributed to more than just one server.
>
> As I said before, the PSF infrastructure committee needs to get on
> top of the job of getting this implemented (including funding this
> development).
>
> If someone wants to volunteer helping with the setup, please contact
> the PSF at psf at python.org.

What about continuing the work that was started last year ?
(and not finished due to a lack of time)

There's a PEP we have started about a mirroring infrastructure:
http://www.python.org/dev/peps/pep-0381/

Some of its parts are already implemented in PyPI, and
what we need now is to work on the client side (pip, distribute, etc)
and bootstrap one or two mirrors using the protocol.
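
For the client-side work mentioned here, one piece is discovering which mirrors exist. The sketch below assumes the naming scheme as PEP 381 describes it (mirror hosts of the form `X.pypi.python.org`, with `X` running through a, b, ..., z, aa, ab, ..., and the highest prefix discoverable via `last.pypi.python.org`); please check the PEP text before relying on it. The function name `mirror_hosts` is hypothetical, and the DNS lookup that yields the last prefix is deliberately left out.

```python
import itertools
import string


def mirror_hosts(last_prefix, domain="pypi.python.org"):
    """List mirror host names from 'a' up to and including `last_prefix`.

    Assumes the PEP 381 prefix sequence a, b, ..., z, aa, ab, ...
    `last_prefix` would come from resolving the 'last' alias, which this
    hypothetical helper does not do itself.
    """
    if not last_prefix or any(c not in string.ascii_lowercase for c in last_prefix):
        raise ValueError("prefix must be one or more lowercase ASCII letters")
    hosts = []
    # Prefixes are enumerated by length first, then alphabetically.
    for length in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            prefix = "".join(letters)
            hosts.append("%s.%s" % (prefix, domain))
            if prefix == last_prefix:
                return hosts
```

A client could then pick one of the returned hosts and fall back to another when the master or the chosen mirror stops responding.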

Regards
Tarek

From martin at v.loewis.de  Sat Jun 12 00:27:46 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 12 Jun 2010 00:27:46 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de>
	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>
Message-ID: <4C12B862.9000503@v.loewis.de>

> What is it now, just a straight WSGI app?

No, FCGI.

> If you want to give me a shell that can just access ps and top for
> now, read-only access to log files, I can try and put some time into
> keeping an eye.

Sorry, no: I don't know you at all.

Regards,
Martin

From martin at v.loewis.de  Sat Jun 12 00:38:25 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 12 Jun 2010 00:38:25 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTikCByKmvf9UDoQ5IKkPsPFB-NW-A1ifEhttSDPI@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>
	<AANLkTikCByKmvf9UDoQ5IKkPsPFB-NW-A1ifEhttSDPI@mail.gmail.com>
Message-ID: <4C12BAE1.3040206@v.loewis.de>

>> If you are willing to invest *a lot* of time, then it seems that rewriting
>> PyPI in Django would make a lot of people happy
>
> -1
>
> PyPI code is evolving. I've added with the help of Mathieu PEP 345 support,
> and we have more stuff coming up.

I can understand why you are opposed: for the same reason I don't want 
to lead such a project. We both have invested time into the PyPI code 
base, and I disagree with all the complaints I heard about it being 
incomprehensible.

The fact remains that people continue to consider the code 
incomprehensible, and that those very people claimed that they would 
prefer if some other web framework was used, specifically Django.
I know Richard Jones is also in favor of a rewrite in Django.

I can also understand that Zope fans might be upset by the prospect of 
having to use Django; to those, I'd say "get over it".

> I think we need to make a difference here between the development of
> the PyPI codebase and the sysadmin work

Most definitely.

Regards,
Martin

From justin.ryan at reliefgarden.org  Sat Jun 12 00:39:50 2010
From: justin.ryan at reliefgarden.org (Justin Ryan)
Date: Fri, 11 Jun 2010 15:39:50 -0700
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C12B862.9000503@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com> 
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com> 
	<4C12A2E4.2090305@v.loewis.de>
	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com> 
	<4C12B862.9000503@v.loewis.de>
Message-ID: <AANLkTilC2kSUQr1Du6fdtCYaLUVmLsYbXFcgcc7Z9poP@mail.gmail.com>

On Fri, Jun 11, 2010 at 3:27 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:

>> If you want to give me a shell that can just access ps and top for
>> now, read-only access to log files, I can try and put some time into
>> keeping an eye.
>
> Sorry, no: I don't know you at all.
>

I know that, I wasn't asking for today.  I had access to the main
plone.org box at some point, siggraph.org, acm.org, so maybe we should
get to know each other.

From justin.ryan at reliefgarden.org  Sat Jun 12 21:48:23 2010
From: justin.ryan at reliefgarden.org (Justin Ryan)
Date: Sat, 12 Jun 2010 12:48:23 -0700
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C13450A.9050104@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com> 
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com> 
	<4C12A2E4.2090305@v.loewis.de>
	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com> 
	<4C12B862.9000503@v.loewis.de>
	<AANLkTilC2kSUQr1Du6fdtCYaLUVmLsYbXFcgcc7Z9poP@mail.gmail.com> 
	<4C12C07F.6000706@v.loewis.de>
	<AANLkTils7kGRLgxdwZrtBVcqRdcJIbvcaG0ITivpVCtT@mail.gmail.com> 
	<4C1338A6.1020601@v.loewis.de>
	<AANLkTincczw85Mp9eOvBue3da2qGRRcRdxip4EOtJzvg@mail.gmail.com> 
	<4C13450A.9050104@v.loewis.de>
Message-ID: <AANLkTim0twWHBk4ScJDF-0ew-8SNtAjB1xLTeOWiJQBe@mail.gmail.com>

Thanks, Martin, for taking the conversation offline to be a real jerk. ;)

On Sat, Jun 12, 2010 at 1:27 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> The question, I think, is what steps can we take to begin alleviating
>> each other's worries?
>>
>> I understand I'm a bit vague, I'm just trying to raise my hand and say
>> hey, let me volunteer.
>
> Ok, so write code.
>

I was looking for guidance as to what to do.

This common response really pisses me off.  It's unclear exactly what
code needs to be written.

>> I think Chris Withers and some others have been doing the same.  People
>> say to email psf, and then when I do, that it's inappropriate, I think
>> we just want some direction on contribution we can make that won't
>> disappear into politics.
>
> I'm not quite sure what specific problem you want to solve. Or, if it's not
> a specific problem, what general problem you want to solve.
>

PyPI is fucking down all the time you nincompoop.

> My observation over the years is this: everything works fine for some time,
> and there are *zero* contributors. Then, a small problem occurs,
> and people offer help and demand drastic changes. Then the problem gets
> solved, and people disappear again.
>

So, turn away help, because we can all go to hell.  I've seen a
problem for a year and I joined the catalog-sig for other reasons, but
find that people ask questions which aren't answered.

Your attitude is very much like a senior employee who is about to get
fired for being unpalatable.

But you can't be fired by the community, so you'll continue to reign
and no one should offer to help because someone you thought would help
in the past didn't.

>> Question is, I guess:
>>
>>  What, exactly, should I do?
>>
>> Why should I be directing the PyPI leader?
>
> I thought you had a proposal on how to solve the problem at hand.
> So I wasn't asking for direction, but for advice.
>

Like most people in all general e-mail communication of the world, you
didn't read the thread closely enough to determine who said what.

I responded to a proposal by someone who you completely ignored.

I can see what is wrong with PyPI, Martin.  I think it's painfully clear.

Anyway, I won't offer to help ever again, promise.  I'll just complain
until you fix things.

Peace, Love, and Go to Hell.

From martin at v.loewis.de  Sun Jun 13 01:29:03 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 13 Jun 2010 01:29:03 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTim0twWHBk4ScJDF-0ew-8SNtAjB1xLTeOWiJQBe@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de>
	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>
	<4C12B862.9000503@v.loewis.de>
	<AANLkTilC2kSUQr1Du6fdtCYaLUVmLsYbXFcgcc7Z9poP@mail.gmail.com>
	<4C12C07F.6000706@v.loewis.de>
	<AANLkTils7kGRLgxdwZrtBVcqRdcJIbvcaG0ITivpVCtT@mail.gmail.com>
	<4C1338A6.1020601@v.loewis.de>
	<AANLkTincczw85Mp9eOvBue3da2qGRRcRdxip4EOtJzvg@mail.gmail.com>
	<4C13450A.9050104@v.loewis.de>
	<AANLkTim0twWHBk4ScJDF-0ew-8SNtAjB1xLTeOWiJQBe@mail.gmail.com>
Message-ID: <4C14183F.7010006@v.loewis.de>

>> Ok, so write code.
>>
>
> I was looking for guidance as to what to do.
>
> This common response really pisses me off.  It's unclear exactly what
> code needs to be written.

The one I proposed to write: add mirroring support to setuptools
and distribute.

> PyPI is fucking down all the time you nincompoop.

Never heard that term before...

In any case, I don't think this is factually correct.

> But you can't be fired by the community, so you'll continue to reign
> and no one should offer to help because someone you thought would help
> in the past didn't.

So prove me wrong. Actually do start helping, instead of insulting.

Regards,
Martin

From guido at python.org  Sun Jun 13 06:32:56 2010
From: guido at python.org (Guido van Rossum)
Date: Sat, 12 Jun 2010 21:32:56 -0700
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTim0twWHBk4ScJDF-0ew-8SNtAjB1xLTeOWiJQBe@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com> 
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com> 
	<4C12A2E4.2090305@v.loewis.de>
	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com> 
	<4C12B862.9000503@v.loewis.de>
	<AANLkTilC2kSUQr1Du6fdtCYaLUVmLsYbXFcgcc7Z9poP@mail.gmail.com> 
	<4C12C07F.6000706@v.loewis.de>
	<AANLkTils7kGRLgxdwZrtBVcqRdcJIbvcaG0ITivpVCtT@mail.gmail.com> 
	<4C1338A6.1020601@v.loewis.de>
	<AANLkTincczw85Mp9eOvBue3da2qGRRcRdxip4EOtJzvg@mail.gmail.com> 
	<4C13450A.9050104@v.loewis.de>
	<AANLkTim0twWHBk4ScJDF-0ew-8SNtAjB1xLTeOWiJQBe@mail.gmail.com>
Message-ID: <AANLkTikt0U8YJU5eAFKysW1L14zs42LT_R1gadXjNutq@mail.gmail.com>

On Sat, Jun 12, 2010 at 12:48 PM, Justin Ryan
<justin.ryan at reliefgarden.org> wrote:
> Thanks, Martin, for taking the conversation offline to be a real jerk. ;)

(I won't quote more. Everyone who read it is still reeling from the
sudden outburst.)

Justin, go wash your mouth with soap. You may be used to this kind of
language in other places, but it is inappropriate here and you will
not get the respect or guidance you are seeking by swearing or
insulting people.

BTW there is no way I can understand your use of the smiley here.

-- 
--Guido van Rossum (python.org/~guido)

From ubernostrum at gmail.com  Sun Jun 13 07:10:26 2010
From: ubernostrum at gmail.com (James Bennett)
Date: Sun, 13 Jun 2010 00:10:26 -0500
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C12B862.9000503@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de>
	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>
	<4C12B862.9000503@v.loewis.de>
Message-ID: <AANLkTimzi4RSjfYKfpgHksitQndkctOvnxBk3_AQa0Ie@mail.gmail.com>

On Fri, Jun 11, 2010 at 5:27 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> What is it now, just a straight WSGI app?
>
> No, FCGI.

Statements like this lead me to believe that ignoring Joel Spolsky
would be the right thing to do.

Right now the PyPI codebase seems to have a bus number[1] of one:
Martin, who is apparently the only person who really understands the
code well enough to do significant work on it. This is something which
could be remedied by having more people learn the code and get
familiar enough with it to make contributions, but that's complicated
by the fact that PyPI still does so much basically from scratch -- it
doesn't even use the standard gateway interface Python web developers
are expected to be familiar with, much less any well-known libraries.

As such, just having people learn the code doesn't seem like a great
option; for one thing, existing knowledge of Python web development
isn't transferrable to PyPI, and working on PyPI isn't transferrable
to anything else a Python web developer would be doing, and so it's
unlikely that many, if any, people would be sufficiently motivated.
Which points to rewriting as the best option, resulting in greater
innate maintainability and a larger community of potential
contributors.

As to *what* it should be rewritten with, I frankly don't care so long
as it's something reasonably well-known and well-understood within the
broader Python web community, and speaks WSGI (which is essentially
the same thing, but it needs to be said). That gives all sorts of
options, from a lightweight stack on something like Werkzeug all the
way up to a full framework solution with something like Pylons. To
avoid the perennial holy wars that choice seems to engender, though,
I'd suggest just asking Martin to pick something he feels he'd be
comfortable with, and having everyone else who wants to help shut up
and go with his choice.


-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."

From martin at v.loewis.de  Sun Jun 13 11:18:36 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 13 Jun 2010 11:18:36 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTimzi4RSjfYKfpgHksitQndkctOvnxBk3_AQa0Ie@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>	<4C12B862.9000503@v.loewis.de>
	<AANLkTimzi4RSjfYKfpgHksitQndkctOvnxBk3_AQa0Ie@mail.gmail.com>
Message-ID: <4C14A26C.2050804@v.loewis.de>

> Right now the PyPI codebase seems to have a bus number[1] of one:
> Martin, who is apparently the only person who really understands the
> code well enough to do significant work on it. This is something which
> could be remedied by having more people learn the code and get
> familiar enough with it to make contributions, but that's complicated
> by the fact that PyPI still does so much basically from scratch -- it
> doesn't even use the standard gateway interface Python web developers
> are expected to be familiar with, much less any well-known libraries.

There are several ways to run PyPI, including WSGI, FCGI, CGI, and a
stand-alone server. The mode which is used on PyPI just happens to be FCGI.

I'm not sure how the integration with Apache matters - the actual code 
generating web pages is the same all the time, no matter what gateway 
interface is being used.
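
To make the point about gateways concrete, here is a minimal sketch using
only the standard library. It is not PyPI's actual code: `application`,
`run_as_cgi` and `run_standalone` are made-up names, and the page it renders
is a placeholder. The idea is simply that one WSGI callable can be handed to
different gateways without touching the page-generating code.

```python
from wsgiref.handlers import CGIHandler
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """Hypothetical page-generating code; it knows nothing about the gateway."""
    path = environ.get("PATH_INFO", "/")
    body = ("You asked for %s\n" % path).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def run_as_cgi():
    # One process per request, driven by the web server's CGI support.
    CGIHandler().run(application)


def run_standalone(port=8000):
    # Long-running server from the standard library (assumed port).
    make_server("localhost", port, application).serve_forever()
```

An FCGI deployment would wrap the same callable with a third-party adapter,
which is left out here because it is not part of the standard library.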

As for the bus number: Richard Jones is also familiar with the code, as 
he wrote it in the first place. He just didn't contribute much lately.
I believe Tarek is also knowledgeable. So the bus factor is rather 3.

> As to *what* it should be rewritten with, I frankly don't care so long
> as it's something reasonably well-known and well-understood within the
> broader Python web community, and speaks WSGI (which is essentially
> the same thing, but it needs to be said).

I don't really want to "sell" the code base, but just for the record:
It's written "in" WSGI, Zope Page Templates, and Postgres. These are
all things that are well-understood in the Python web community.

> That gives all sorts of
> options, from a lightweight stack on something like Werkzeug all the
> way up to a full framework solution with something like Pylons. To
> avoid the perennial holy wars that choice seems to engender, though,
> I'd suggest just asking Martin to pick something he feels he'd be
> comfortable with, and having everyone else who wants to help shut up
> and go with his choice.

It would be really up to Richard Jones, and he said he would prefer
Django; so do I.

Regards,
Martin


From g.brandl at gmx.net  Sun Jun 13 12:09:32 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 13 Jun 2010 12:09:32 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTimzi4RSjfYKfpgHksitQndkctOvnxBk3_AQa0Ie@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>	<4C12B862.9000503@v.loewis.de>
	<AANLkTimzi4RSjfYKfpgHksitQndkctOvnxBk3_AQa0Ie@mail.gmail.com>
Message-ID: <hv2arf$n45$1@dough.gmane.org>

Am 13.06.2010 07:10, schrieb James Bennett:
> On Fri, Jun 11, 2010 at 5:27 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>>> What is it now, just a straight WSGI app?
>>
>> No, FCGI.
> 
> Statements like this lead me to believe that ignoring Joel Spolsky
> would be the right thing to do.
> 
> Right now the PyPI codebase seems to have a bus number[1] of one:
> Martin, who is apparently the only person who really understands the
> code well enough to do significant work on it.

JFTR, I had a look at the code at PyCon last year and I could find my
way around it quite quickly.  It's not like PyPI is such a big codebase
that you need a year to get familiar with it.

This is of course not an argument against a rewrite, but the situation
is certainly not as gloomy as it is painted here from time to time.

Georg


-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From solipsis at pitrou.net  Sun Jun 13 13:20:05 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 13 Jun 2010 11:20:05 +0000 (UTC)
Subject: [Catalog-sig] PyPI down again...
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>	<4C12B862.9000503@v.loewis.de>
	<AANLkTimzi4RSjfYKfpgHksitQndkctOvnxBk3_AQa0Ie@mail.gmail.com>
	<4C14A26C.2050804@v.loewis.de>
Message-ID: <loom.20100613T131752-743@post.gmane.org>

Martin v. Löwis <martin <at> v.loewis.de> writes:
> 
> I don't really want to "sell" the code base, but just for the record:
> It's written "in" WSGI, Zope Page Templates, and Postgres. These are
> all things that are well-understood in the Python web community.
> 
[...]
> 
> It would be really up to Richard Jones, and he said he would prefer
> Django; so do I.

I'm saying this from (far) outside the playground and am not intending to
contribute, so just take this as a suggestion, but: if it has to be rewritten,
how about doing it in Python 3?

Regards

Antoine.



From mal at egenix.com  Sun Jun 13 14:49:57 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun, 13 Jun 2010 14:49:57 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTimzi4RSjfYKfpgHksitQndkctOvnxBk3_AQa0Ie@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>	<4C12B862.9000503@v.loewis.de>
	<AANLkTimzi4RSjfYKfpgHksitQndkctOvnxBk3_AQa0Ie@mail.gmail.com>
Message-ID: <4C14D3F5.9030001@egenix.com>

James Bennett wrote:
> On Fri, Jun 11, 2010 at 5:27 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>>> What is it now, just a straight WSGI app?
>>
>> No, FCGI.
> 
> Statements like this lead me to believe that ignoring Joel Spolsky
> would be the right thing to do.
> 
> Right now the PyPI codebase seems to have a bus number[1] of one:
> Martin, who is apparently the only person who really understands the
> code well enough to do significant work on it. This is something which
> could be remedied by having more people learn the code and get
> familiar enough with it to make contributions, but that's complicated
> by the fact that PyPI still does so much basically from scratch -- it
> doesn't even use the standard gateway interface Python web developers
> are expected to be familiar with, much less any well-known libraries.
> 
> As such, just having people learn the code doesn't seem like a great
> option; for one thing, existing knowledge of Python web development
> isn't transferrable to PyPI, and working on PyPI isn't transferrable
> to anything else a Python web developer would be doing, and so it's
> unlikely that many, if any, people would be sufficiently motivated.
> Which points to rewriting as the best option, resulting in greater
> innate maintainability and a larger community of potential
> contributors.

Why don't you just start such a project, flesh out the details,
use the existing PyPI as reference for the APIs and then propose
that we use the new code for running PyPI ?

I think that if someone wants to do a rewrite it's best to just
let them decide about the choice of technology. Even if it doesn't
get used for PyPI in the end, it will still be an alternative
choice for local PyPI-style indexes for projects like Zope or
Plone to use, so work is not lost.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 13 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                35 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From mal at egenix.com  Sun Jun 13 14:58:38 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun, 13 Jun 2010 14:58:38 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTim0twWHBk4ScJDF-0ew-8SNtAjB1xLTeOWiJQBe@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de>	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>
	<4C12B862.9000503@v.loewis.de>	<AANLkTilC2kSUQr1Du6fdtCYaLUVmLsYbXFcgcc7Z9poP@mail.gmail.com>
	<4C12C07F.6000706@v.loewis.de>	<AANLkTils7kGRLgxdwZrtBVcqRdcJIbvcaG0ITivpVCtT@mail.gmail.com>
	<4C1338A6.1020601@v.loewis.de>	<AANLkTincczw85Mp9eOvBue3da2qGRRcRdxip4EOtJzvg@mail.gmail.com>
	<4C13450A.9050104@v.loewis.de>
	<AANLkTim0twWHBk4ScJDF-0ew-8SNtAjB1xLTeOWiJQBe@mail.gmail.com>
Message-ID: <4C14D5FE.1070909@egenix.com>

Justin Ryan wrote:
> [...lots of disrespectful and rude words...]

Justin, you just disqualified yourself from being accepted as a
respected member of this group.

I think Martin deserves a public apology from you.

-- 
Marc-Andre Lemburg
eGenix.com


From tseaver at palladion.com  Sun Jun 13 15:05:41 2010
From: tseaver at palladion.com (Tres Seaver)
Date: Sun, 13 Jun 2010 09:05:41 -0400
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <loom.20100613T131752-743@post.gmane.org>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>	<4C12B862.9000503@v.loewis.de>	<AANLkTimzi4RSjfYKfpgHksitQndkctOvnxBk3_AQa0Ie@mail.gmail.com>	<4C14A26C.2050804@v.loewis.de>
	<loom.20100613T131752-743@post.gmane.org>
Message-ID: <hv2l34$hfu$2@dough.gmane.org>

Antoine Pitrou wrote:
> Martin v. Löwis <martin <at> v.loewis.de> writes:
>> I don't really want to "sell" the code base, but just for the record:
>> It's written "in" WSGI, Zope Page Templates, and Postgres. These are
>> all things that are well-understood in the Python web community.
>>
> [...]
>> It would be really up to Richard Jones, and he said he would prefer
>> Django; so do I.
> 
> I'm saying this from (far) outside the playground and am not intending to
> contribute, so just take this as a suggestion, but: if it has to be rewritten,
> how about doing it in Python 3?

Such a choice would be contrary to the goal of keeping it in the "well
known Python web technologies" swimlane, to ease support by folks
already familiar with WSGI, etc.:  none of the libraries / frameworks
are ported yet.


Tres.
-- 
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com


From mal at egenix.com  Sun Jun 13 15:11:04 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun, 13 Jun 2010 15:11:04 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>
	<4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
Message-ID: <4C14D8E8.4010903@egenix.com>

Tarek Ziadé wrote:
> On Fri, Jun 11, 2010 at 11:06 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>> "Martin v. Löwis" wrote:
>>> For a smaller project, start putting mirror support into setuptools or
>>> distribute; this would make short (several hours) outages less severe
>>> for the class of users that want permanent availability for downloading.
>>> It's unlikely that the mirrors would break when the master goes down;
>>> they just stop mirroring.
>>
>> A better and cleaner strategy is to put the static PyPI information
>> up on Amazon Cloudfront and have DNS take care of providing local
>> mirrors (edge servers) to setuptools et al.
>>
>> Such a setup won't require any complicated mirror logic in any
>> of the existing client tools.
>>
>> By moving the PyPI installation to Amazon AWS, we could also
>> get the RPC access distributed to more than just one server.
>>
>> As I said before, the PSF infrastructure committee needs to get on
>> the job of getting this implemented (including funding this
>> development).
>>
>> If someone wants to volunteer helping with the setup, please contact
>> the PSF at psf at python.org.
> 
> What about continuing the work that was started last year ?
> (and not finished due to a lack of time)
> 
> There's a PEP we have started about a mirroring infrastructure:
> http://www.python.org/dev/peps/pep-0381/
> 
> Some of its parts are already implemented in PyPI, and
> what we need now is to work on the client side (pip, distribute, etc)
> and bootstrap one or two mirrors using the protocol.

We've had some private discussions about this, so I'm just
going to summarize...

The idea here is not to override the mirror PEP ideas,
but to use the existing PyPI installation and put the
content needed for the most widely distributed package tools
(currently setuptools and zc.buildout) on a content
delivery network (CDN) in order to have it highly available
on a managed edge network.

Amazon Cloudfront is such a CDN and has Python interfaces,
hence the idea to use Cloudfront.

I asked for volunteers, because I didn't know enough about
Amazon Cloudfront to write up a proposal and don't have
the cycles available to implement such a setup myself.

In the meantime, I've done some research and now know
enough to write a proposal for the PSF board to consider.
If the board thinks it's a good idea, we'll need to
pursue finding volunteers to implement it.

-- 
Marc-Andre Lemburg
eGenix.com


From solipsis at pitrou.net  Sun Jun 13 15:11:52 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 13 Jun 2010 13:11:52 +0000 (UTC)
Subject: [Catalog-sig] PyPI down again...
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>	<4C12B862.9000503@v.loewis.de>	<AANLkTimzi4RSjfYKfpgHksitQndkctOvnxBk3_AQa0Ie@mail.gmail.com>	<4C14A26C.2050804@v.loewis.de>
	<loom.20100613T131752-743@post.gmane.org>
	<hv2l34$hfu$2@dough.gmane.org>
Message-ID: <loom.20100613T150922-700@post.gmane.org>

Tres Seaver <tseaver <at> palladion.com> writes:
> 
> > I'm saying this from (far) outside the playground and am not intending to
> > contribute, so just take this as a suggestion, but: if it has to be rewritten,
> > how about doing it in Python 3?
> 
> Such a choice would be contrary to the goal of keeping it in the "well
> known Python web technologies" swimlane, to ease support by folks
> already familiar with WSGI, etc.:  none of the libraries / frameworks
> are ported yet.

SQLAlchemy and other libraries have been ported (as well as mod_wsgi).
No major framework appears to have been ported, though.

Regards

Antoine.



From ziade.tarek at gmail.com  Sun Jun 13 17:14:13 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Sun, 13 Jun 2010 17:14:13 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <hv2arf$n45$1@dough.gmane.org>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de>
	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>
	<4C12B862.9000503@v.loewis.de>
	<AANLkTimzi4RSjfYKfpgHksitQndkctOvnxBk3_AQa0Ie@mail.gmail.com>
	<hv2arf$n45$1@dough.gmane.org>
Message-ID: <AANLkTikWA79JPqvahr9O9T8uZBgFYvxr_BzdR0qtH81V@mail.gmail.com>

On Sun, Jun 13, 2010 at 12:09 PM, Georg Brandl <g.brandl at gmx.net> wrote:
> Am 13.06.2010 07:10, schrieb James Bennett:
>> On Fri, Jun 11, 2010 at 5:27 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>>>> What is it now, just a straight WSGI app?
>>>
>>> No, FCGI.
>>
>> Statements like this lead me to believe that ignoring Joel Spolsky
>> would be the right thing to do.
>>
>> Right now the PyPI codebase seems to have a bus number[1] of one:
>> Martin, who is apparently the only person who really understands the
>> code well enough to do significant work on it.
>
> JFTR, I had a look at the code at PyCon last year and I could find my
> way around it quite quickly.  It's not like PyPI is such a big codebase
> that you need a year to get familiar with it.
>
> This is of course not an argument against a rewrite, but the situation
> is certainly not as gloomy as it is painted here from time to time.

+1

I've written several patches and didn't have a problem understanding it.
>
> Georg
>



-- 
Tarek Ziadé | http://ziade.org

From ziade.tarek at gmail.com  Sun Jun 13 17:26:47 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Sun, 13 Jun 2010 17:26:47 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C14D8E8.4010903@egenix.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>
Message-ID: <AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>

On Sun, Jun 13, 2010 at 3:11 PM, M.-A. Lemburg <mal at egenix.com> wrote:
...
>
> We've had some private discussions about this, so I'm just
> going to summarize...
>
> The idea here is not to override the mirror PEP ideas,
> but to use the existing PyPI installation and put the
> content needed for the most widely distributed package tools
> (currently setuptools and zc.buildout) on a content
> delivery network (CDN) in order to have it highly available
> on a managed edge network.

I think it overlaps a bit with the PEP goal, which is to set up a network of mirrors,
and have them listed in the PyPI DNS so clients can switch from one mirror
to another (and even do geolocation!).

Right now we already have "unofficial mirrors" and the idea of the PEP
would be to list them officially at PyPI and to have them collect the
stats so we can count download hits.

> Amazon Cloudfront is such a CDN and has Python interfaces,
> hence the idea to use Cloudfront.
>
> I asked for volunteers, because I didn't know enough about
> Amazon Cloudfront to write up a proposal and don't have
> the cycles available to implement such a setup myself.
>
> In the meantime, I've done some research and now know
> enough to write a proposal for the PSF board to consider.
> If the board thinks it's a good idea, we'll need to
> pursue finding volunteers to implement it.

Well, maybe this is the best path to follow right now, as it will be done faster,
without having to interact with many people to set it up, so it's a quick win.

But it will probably kill the mirroring protocol idea from the PEP in the
process, which I think is superior in the long term since it provides a
standardized ground for the community to set up mirrors independently of
pypi.python.org.

Regards
Tarek
-- 
Tarek Ziadé | http://ziade.org

From ziade.tarek at gmail.com  Sun Jun 13 17:36:00 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Sun, 13 Jun 2010 17:36:00 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C14A26C.2050804@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de>
	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>
	<4C12B862.9000503@v.loewis.de>
	<AANLkTimzi4RSjfYKfpgHksitQndkctOvnxBk3_AQa0Ie@mail.gmail.com>
	<4C14A26C.2050804@v.loewis.de>
Message-ID: <AANLkTiku6rqau5gliTDnu9mi-rL1ePOCzkDWbXeplzdr@mail.gmail.com>

On Sun, Jun 13, 2010 at 11:18 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> Right now the PyPI codebase seems to have a bus number[1] of one:
>> Martin, who is apparently the only person who really understands the
>> code well enough to do significant work on it. This is something which
>> could be remedied by having more people learn the code and get
>> familiar enough with it to make contributions, but that's complicated
>> by the fact that PyPI still does so much basically from scratch -- it
>> doesn't even use the standard gateway interface Python web developers
>> are expected to be familiar with, much less any well-known libraries.
>
> There are several ways to run PyPI, including WSGI, FCGI, CGI, and a
> stand-alone server. The mode which is used on PyPI just happens to be FCGI.
>
> I'm not sure how the integration with Apache matters - the actual code
> generating web pages is the same all the time, no matter what gateway
> interface is being used.
>
> As for the bus number: Richard Jones is also familiar with the code, as he
> wrote it in the first place. He just didn't contribute much lately.
> I believe Tarek is also knowledgeable. So the bus factor is rather 3.

I am pretty confident with the code now, and I don't think it's very complex.
It just grew big in some parts, like webui.py, which needs to be split up.

Frankly, I think it just needs a bit of cleanup, maybe a migration to
SQLAlchemy, but that's it.  As a matter of fact, some folks in the Montreal
Python user group are working on refactoring it right now, because they
wanted to provide some new features.

So I would be -0 on writing it from scratch.

I'd suggest moving it to a DVCS (hg.python.org?) to make contributions
easier.

Regards
Tarek

-- 
Tarek Ziadé | http://ziade.org

From g.brandl at gmx.net  Sun Jun 13 17:53:29 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 13 Jun 2010 17:53:29 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <loom.20100613T150922-700@post.gmane.org>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>	<4C12B862.9000503@v.loewis.de>	<AANLkTimzi4RSjfYKfpgHksitQndkctOvnxBk3_AQa0Ie@mail.gmail.com>	<4C14A26C.2050804@v.loewis.de>	<loom.20100613T131752-743@post.gmane.org>	<hv2l34$hfu$2@dough.gmane.org>
	<loom.20100613T150922-700@post.gmane.org>
Message-ID: <hv2v0c$khj$1@dough.gmane.org>

Am 13.06.2010 15:11, schrieb Antoine Pitrou:
> Tres Seaver <tseaver <at> palladion.com> writes:
>> 
>> > I'm saying this from (far) outside the playground and am not intending to
>> > contribute, so just take this as a suggestion, but: if it has to be rewritten,
>> > how about doing it in Python 3?
>> 
>> Such a choice would be contrary to the goal of keeping it in the "well
>> known Python web technologies" swimlane, to ease support by folks
>> already familiar with WSGI, etc.:  none of the libraries / frameworks
>> are ported yet.
> 
> SQLAlchemy and other libraries have been ported (as well as mod_wsgi).
> No major framework appears to have been ported, though.

That's also because, last I heard, there was no consensus yet on what WSGI
would look like on Python 3.  But that would be on-topic for web-SIG.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From martin at v.loewis.de  Sun Jun 13 19:33:56 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sun, 13 Jun 2010 19:33:56 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <loom.20100613T131752-743@post.gmane.org>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>	<4C12B862.9000503@v.loewis.de>	<AANLkTimzi4RSjfYKfpgHksitQndkctOvnxBk3_AQa0Ie@mail.gmail.com>	<4C14A26C.2050804@v.loewis.de>
	<loom.20100613T131752-743@post.gmane.org>
Message-ID: <4C151684.3070808@v.loewis.de>

> I'm saying this from (far) outside the playground and am not intending to
> contribute, so just take this as a suggestion, but: if it has to be rewritten,
> how about doing in Python 3?

It wouldn't really matter, so: +0.

Regards,
Martin

From martin at v.loewis.de  Sun Jun 13 19:40:20 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 13 Jun 2010 19:40:20 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>
	<4C12A54D.1070406@egenix.com>	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
Message-ID: <4C151804.6050903@v.loewis.de>

> I think it overlaps a bit the PEP goal, which is to set up a network of mirrors,
> and have them listed in the PyPI DNS  so clients can switch from one mirror
> to another.(and even do geoloc!)

JFTR, this already exists. a.mirrors.pypi.python.org and 
b.mirrors.pypi.python.org are already there and could be used by clients.

> Well maybe this is the best path to follow right now, as it will be done faster,
> without having to interact with much people to set it up, so it's a quick win.

My main worry (besides the client integration) is statistics: I do want 
to get download statistics. So anybody implementing it would have to 
find a way of fetching the download numbers from Amazon.

> But it will probably kill the mirroring protocol idea from the PEP in
> the process,
> which I think is superior in the long term since it provides a
> standardized ground
> for the community to set up mirrors independently from pypi.python.org.

I also remain skeptical that this cloud idea is useful at all. Amazon
Cloudfront is a *beta* service. So they aren't sure themselves whether 
it works correctly - and there have been reports about two-day outages 
of EC2, for bitbucket.org. There also have been complaints about the 
available bandwidth. So I'm not sure whether replacing a single point of 
failure with a different one is actually improving anything.

Regards,
Martin

From ziade.tarek at gmail.com  Sun Jun 13 22:06:10 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Sun, 13 Jun 2010 22:06:10 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C151804.6050903@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C151804.6050903@v.loewis.de>
Message-ID: <AANLkTikq6lDLrIF2oFqjBXjgJM02HwFCgiaaCHnq7eJv@mail.gmail.com>

2010/6/13 "Martin v. Löwis" <martin at v.loewis.de>:
>> I think it overlaps a bit the PEP goal, which is to set up a network of
>> mirrors,
>> and have them listed in the PyPI DNS so clients can switch from one
>> mirror
>> to another.(and even do geoloc!)
>
> JFTR, this already exists. a.mirrors.pypi.python.org and
> b.mirrors.pypi.python.org are already there and could be used by clients.

I wasn't aware of these mirrors. Do you maintain them? How are they
synchronized?
Do you get the statistics if we use them?

If so, we could start to use them in all clients asap (as fallbacks
if PyPI goes down).

>
>> Well maybe this is the best path to follow right now, as it will be done
>> faster,
>> without having to interact with much people to set it up, so it's a quick
>> win.
>
> My main worry (besides the client integration) is statistics: I do want to
> get download statistics. So anybody implementing it would have to find a way
> of fetching the download numbers from Amazon.
>
>> But it will probably kill the mirroring protocol idea from the PEP in
>> the process,
>> which I think is superior in the long term since it provides a
>> standardized ground
>> for the community to set up mirrors independently from pypi.python.org.
>
> I also remain skeptical that this cloud idea is useful at all. Amazon
> Cloudfront is a *beta* service. So they aren't sure themselves whether it
> works correctly - and there have been reports about two-day outages of EC2,
> for bitbucket.org. There also have been complaints about the available
> bandwidth. So I'm not sure whether replacing a single point of failure with
> a different one is actually improving anything.

ISTM that the workload is the same whether a cloud or a regular mirror is used,
because of the statistics.  FYI, the work to be done for the mirrors,
besides the PEP editing, consists of:

- implementing the extra pages generation + stats builder in a package
  like z3c.pypimirror (http://pypi.python.org/pypi/z3c.pypimirror), which is
  used by several mirrors.

- adding the client-side code in a project like Distribute or Pip

Regards
Tarek



-- 
Tarek Ziadé | http://ziade.org

From martin at v.loewis.de  Sun Jun 13 22:19:09 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 13 Jun 2010 22:19:09 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTikq6lDLrIF2oFqjBXjgJM02HwFCgiaaCHnq7eJv@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<4C12A54D.1070406@egenix.com>	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>	<4C14D8E8.4010903@egenix.com>	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>	<4C151804.6050903@v.loewis.de>
	<AANLkTikq6lDLrIF2oFqjBXjgJM02HwFCgiaaCHnq7eJv@mail.gmail.com>
Message-ID: <4C153D3D.7050604@v.loewis.de>

>> JFTR, this already exists. a.mirrors.pypi.python.org and
>> b.mirrors.pypi.python.org are already there and could be used by clients.
>
> I wasn't aware of these mirrors. Do you maintain them ? how are they
> synchronized ?

Yes, using pep381client. Notice that a.mirrors is dinsdale itself,
so there is only a single mirror.

> Do you get the statistics if we use them ?

Not yet; that's not implemented yet. More specifically, I get the 
statistics, but the log files are not yet processed.

> If so, we could start to use them in all clients asap. (as fallbacks
> if PyPI gets down)

As I said: setuptools and distribute should start supporting PEP 381,
as an experimental feature.

> - adding the client-side code in a project like Distribute or Pp

I think MAL believes that this would not be necessary if the Amazon 
service would be used; I remain skeptical. I'd rather have the clients 
try explicitly (and indicate to the user that they are using a mirror).
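As a rough sketch of what "try explicitly and tell the user" could look like on the client side: the two mirror host names are the ones mentioned earlier in this thread, while the function names, the plain TCP reachability check and the ordering are assumptions made for illustration only. This is not setuptools or distribute code.

```python
import socket

# Host names as mentioned in this thread; everything else here is hypothetical.
MAIN_INDEX = "pypi.python.org"
MIRRORS = ["a.mirrors.pypi.python.org", "b.mirrors.pypi.python.org"]


def is_reachable(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_index(main=MAIN_INDEX, mirrors=MIRRORS, probe=is_reachable, report=print):
    """Return the first reachable host, preferring the main index.

    If a mirror is chosen, the user is told so explicitly through `report`.
    Returns None when neither the main index nor any mirror responds.
    """
    if probe(main):
        return main
    for mirror in mirrors:
        if probe(mirror):
            report("%s is unreachable; using mirror %s" % (main, mirror))
            return mirror
    return None
```

Passing `probe` and `report` as parameters keeps the fallback order testable without touching the network.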

Regards,
Martin

From mal at egenix.com  Mon Jun 14 11:12:07 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 14 Jun 2010 11:12:07 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C151804.6050903@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<4C12A54D.1070406@egenix.com>	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>	<4C14D8E8.4010903@egenix.com>	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C151804.6050903@v.loewis.de>
Message-ID: <4C15F267.6040108@egenix.com>

"Martin v. Löwis" wrote:
>> I think it overlaps a bit the PEP goal, which is to set up a network
>> of mirrors,
>> and have them listed in the PyPI DNS  so clients can switch from one
>> mirror
>> to another.(and even do geoloc!)
> 
> JFTR, this already exists. a.mirrors.pypi.python.org and
> b.mirrors.pypi.python.org are already there and could be used by clients.
> 
>> Well maybe this is the best path to follow right now, as it will be
>> done faster,
>> without having to interact with much people to set it up, so it's a
>> quick win.
> 
> My main worry (besides the client integration) is statistics: I do want
> to get download statistics. So anybody implementing it would have to
> find a way of fetching the download numbers from Amazon.

Download statistics are readily available from Amazon Cloudfront,
so no worries: you'll get statistics for all edge server downloads.

http://docs.amazonwebservices.com/AmazonCloudFront/latest/DeveloperGuide/index.html?AccessLogs.html

>> But it will probably kill the mirroring protocol idea from the PEP in
>> the process,
>> which I think is superior in the long term since it provides a
>> standardized ground
>> for the community to set up mirrors independently from pypi.python.org.
> 
> I also remain skeptical that this cloud idea is useful at all. Amazon
> Cloudfront is a *beta* service. So they aren't sure themselves whether
> it works correctly - and there have been reports about two-day outages
> of EC2, for bitbucket.org. There also have been complaints about the
> available bandwidth. So I'm not sure whether replacing a single point of
> failure with a different one is actually improving anything.

Amazon Cloudfront uses S3 as basis for the service, S3 has been
around for years and has a very stable uptime:

http://www.readwriteweb.com/archives/amazon_s3_exceeds_9999_percent_uptime.php

Cloudfront itself has been around since Nov 2008.

You can check their current online status using this panel:

http://status.aws.amazon.com/

Apart from the gained availability and outsourced management,
we'd also get faster downloads in most parts of the world,
due to the local caching Cloudfront is applying (and this can
be used to further increase the availability, since we can
control the expiry time of those local copies).

So in summary we are replacing a single point of failure with N
points of failure (with N being the number of edge caching
servers they use).

Regarding the bitbucket problem you mentioned:

EC2 is their virtual server service, which we don't use. The
bitbucket problems originated from a) a DDoS attack on their
virtual servers running on EC2 and b) a problem with the
Amazon EBS, which is their virtualized SAN, and was related
to the way the DDoS was done (EBS and the DDoS attack both
used UDP):

http://blog.bitbucket.org/2009/10/04/on-our-extended-downtime-amazon-and-whats-coming/
"""
And to re-iterate, the problem wasn't really Amazon EC2 or EBS, it was isolated to our case, due to
the nature of the attack.
"""

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 14 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                34 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From mal at egenix.com  Mon Jun 14 11:27:15 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 14 Jun 2010 11:27:15 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>
	<4C12A54D.1070406@egenix.com>	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
Message-ID: <4C15F5F3.40501@egenix.com>

Tarek Ziadé wrote:
> On Sun, Jun 13, 2010 at 3:11 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> ...
>>
>> We've had some private discussions about this, so I'm just
>> going to summarize...
>>
>> The idea here is not to override the mirror PEP ideas,
>> but to use the existing PyPI installation and put the
>> content needed for the most widely distributed package tool
>> (currently setuptools and zc.buildout) on a content
>> delivery network (CDN) in order to have it highly available
>> on a managed edge network.
> 
> I think it overlaps a bit the PEP goal, which is to set up a network of mirrors,
> and have them listed in the PyPI DNS so clients can switch from one mirror
> to another.(and even do geoloc!)
> 
> Right now we already have "unofficial mirrors" and the idea of the PEP
> would be to list them officially at PyPI and to have them collect the
> stats so we can count download hits.

Note that the CDN does not mirror the content of PyPI, it
just takes care of delivering the requested data to the
various edge servers and caching it there for a while.

This is a different concept than that of a full mirror that
doesn't work like a cache, but instead provides a fully
functional standalone server.

I still think that the concept of being able to mirror PyPI
servers is a useful one.

>> Amazon Cloudfront is such a CDN and has Python interfaces,
>> hence the idea to use Cloudfront.
>>
>> I asked for volunteers, because I didn't know enough about
>> Amazon Cloudfront to write up a proposal and don't have
>> the cycles available to implement such a setup myself.
>>
>> In the meantime, I've done some research and now know
>> enough to write a proposal for the PSF board to consider.
>> If the board thinks it's a good idea, we'll need to
>> pursue finding volunteers to implement it.
> 
> Well maybe this is the best path to follow right now, as it will be done faster,
> without having to interact with much people to set it up, so it's a quick win.
> 
> But it will probably kill the mirroring protocol idea from the PEP in
> the process,
> which I think is superior in the long term since it provides a
> standardized ground
> for the community to set up mirrors independently from pypi.python.org.

We'll have to see.

Note that the CDN will only deal with the static data on PyPI,
not the RPC or the web GUI access.

Since static data is all that setuptools et al. currently use
for fetching the data, we'll see an improved uptime for easy_install
and esp. zc.buildout which by nature of their concepts rely on having
a high availability of the PyPI static data resources.

If, in the future, package tools start to rely on RPC for
fetching data, the situation will shift towards needing full
functional mirrors again.

OTOH, we could also provide a snapshot copy of the database
data in the form of a SQLite database on the CDN for those tools
to download and use locally... there are lots of things
package tools could do :-)
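To illustrate the snapshot idea, here is a minimal sketch of a tool querying a locally downloaded SQLite file. The table name, the column names and the helper functions are invented for this example; nothing in this thread defines an actual snapshot schema.

```python
import sqlite3


def open_snapshot(path=":memory:"):
    """Open a snapshot database; the `releases` table layout is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS releases (name TEXT, version TEXT, author TEXT)"
    )
    return conn


def releases_by_author(conn, author):
    """Return (name, version) pairs for one author, answered entirely locally."""
    rows = conn.execute(
        "SELECT name, version FROM releases WHERE author = ? ORDER BY name, version",
        (author,),
    )
    return rows.fetchall()
```

A query such as "all distributions from this author" then needs no round trip to the server once the snapshot file has been fetched as static data.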

-- 
Marc-Andre Lemburg
eGenix.com


From marrakis at gmail.com  Mon Jun 14 12:35:08 2010
From: marrakis at gmail.com (Mathieu Leduc-Hamel)
Date: Mon, 14 Jun 2010 12:35:08 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C15F5F3.40501@egenix.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C15F5F3.40501@egenix.com>
Message-ID: <AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>

To continue the discussion about a rewrite or a cleanup of the PyPI
codebase: I'm from the Montreal-Python user group, and I'd say that yes, at
first glance the current PyPI codebase seems very unclear and difficult to
maintain.

But it's not an impossible mission, and we are currently in the process of:

- Adding functional tests. The test coverage is now around 40%.
- Replacing the psycopg API with SQLAlchemy, once we reach more complete
  coverage.
- Replacing many manual manipulations of the metadata with a more robust and
  straightforward way of dealing with it (distutils2 might be the option there).

At first I was thinking about rewriting everything using the chishop project
(an implementation of PyPI using Django). But keeping control of the source
code and not depending on any framework is maybe a better idea.

Moreover, despite the frequent outages, PyPI is working today, so just
modernizing the code base seems to be the best idea.

By the way, after a code review by Tarek, a very useful thing might be to
find a better way to handle and integrate contributions coming from the
community. Right now Tarek is responsible for making the link between our
effort and the work of Martin, but we don't have any official public mirror
of the source code or any roadmap.


From mark at geek.net  Mon Jun 14 13:50:56 2010
From: mark at geek.net (Mark Ramm)
Date: Mon, 14 Jun 2010 07:50:56 -0400
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C15F5F3.40501@egenix.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C15F5F3.40501@egenix.com>
Message-ID: <AANLkTileCAZSsaWOkYrdAtOi6TAgZkUow3TYmBdmT9nO@mail.gmail.com>

> If, in the future, package tools start to rely on RPC for
> fetching data, the situation will shift towards needing full
> functional mirrors again.

Ideally we move some of this to be accessible via a more REST style
interface where http GET requests (which would be by far the most
common case) are still cacheable via all the standard mechanisms.

I'm not a REST evangelist in most cases, but when scale and
availability really do matter, REST buys you quite a bit by allowing
you to scale and cache in all the ways that the web does.
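A small sketch of what "cacheable via all the standard mechanisms" can mean for a client: a conditional GET that revalidates a cached copy with `If-None-Match`. The cache dictionary, the function names and the injectable `opener` are assumptions for illustration, and whether the index sends `ETag` headers at all is not established in this thread.

```python
import urllib.error
import urllib.request

_cache = {}  # url -> (etag, body); a hypothetical in-memory cache


def build_request(url, cache=_cache):
    """Build a GET request, adding If-None-Match when a cached copy exists."""
    request = urllib.request.Request(url)
    cached = cache.get(url)
    if cached is not None:
        request.add_header("If-None-Match", cached[0])
    return request


def cached_get(url, cache=_cache, opener=urllib.request.urlopen):
    """Fetch url, reusing the cached body when the server answers 304."""
    request = build_request(url, cache)
    try:
        with opener(request) as response:
            body = response.read()
            etag = response.headers.get("ETag")
    except urllib.error.HTTPError as err:
        # urllib reports "304 Not Modified" as an HTTPError.
        if err.code == 304 and url in cache:
            return cache[url][1]
        raise
    if etag:
        cache[url] = (etag, body)
    return body
```

The same validators are what intermediate HTTP caches and CDNs rely on, which is why plain GET requests scale where RPC calls over POST do not.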

--Mark Ramm

From ametaireau at gmail.com  Mon Jun 14 16:02:50 2010
From: ametaireau at gmail.com (=?UTF-8?Q?Alexis_M=C3=A9taireau?=)
Date: Mon, 14 Jun 2010 16:02:50 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTileCAZSsaWOkYrdAtOi6TAgZkUow3TYmBdmT9nO@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com> 
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com> 
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com> 
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com> 
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com> 
	<4C15F5F3.40501@egenix.com>
	<AANLkTileCAZSsaWOkYrdAtOi6TAgZkUow3TYmBdmT9nO@mail.gmail.com>
Message-ID: <AANLkTikD97sRortV81717A01p4hDvmh8MbNfYZP83HxY@mail.gmail.com>

Hi all,

Distutils2 will bring two APIs to query PyPI: one via the "simple" API and
one via the XML-RPC interface.

The fact is that the simple API (it's just HTML pages, in a REST style as
pointed out by Mark) does not provide all the information we need, especially
about distribution dependencies, or if we want to query other things
contained in the metadata.

I'm working on two simple APIs for that, and I'll probably make a wrapper
around both, which could choose the right one to use depending on the needs
(e.g. not always relying on RPC or on "REST").
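One possible shape for such a wrapper, as a sketch only: each backend declares which operations it supports, and the wrapper picks the first backend that can answer. The class names, the operation names and the "cheapest backend first" ordering are all hypothetical and are not the Distutils2 API.

```python
class Backend:
    """One way of talking to the index (e.g. simple pages or XML-RPC)."""

    def __init__(self, name, supports, handler):
        self.name = name
        self.supports = frozenset(supports)  # operations this backend can answer
        self.handler = handler               # callable(operation, **params)

    def query(self, operation, **params):
        return self.handler(operation, **params)


class IndexClient:
    """Dispatch each operation to the first backend that supports it."""

    def __init__(self, backends):
        # Order matters: list the cheapest (static, cacheable) backend first.
        self.backends = list(backends)

    def choose(self, operation):
        for backend in self.backends:
            if operation in backend.supports:
                return backend
        raise LookupError("no backend supports %r" % (operation,))

    def query(self, operation, **params):
        return self.choose(operation).query(operation, **params)
```

With the static backend listed first, RPC is only used for queries the static pages cannot answer, such as searching by author.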

As we are talking about refactoring PyPI, it would probably be nice to have a
real REST API that talks JSON or XML, replacing the HTML pages currently
served on http://pypi.python.org/simple/ :)

Cheers,
Alexis


From mal at egenix.com  Mon Jun 14 16:12:49 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 14 Jun 2010 16:12:49 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTikD97sRortV81717A01p4hDvmh8MbNfYZP83HxY@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C15F5F3.40501@egenix.com>	<AANLkTileCAZSsaWOkYrdAtOi6TAgZkUow3TYmBdmT9nO@mail.gmail.com>
	<AANLkTikD97sRortV81717A01p4hDvmh8MbNfYZP83HxY@mail.gmail.com>
Message-ID: <4C1638E1.102@egenix.com>

When designing such interfaces, please consider that the PyPI information
is mostly static. If there's information missing, it should be easy to add
it to e.g. a new info file placed into the package's "simple" directory
that package tools could pick up in REST style.

Static directories just scale a lot better than any kind of (true) RPC
interface and offloading some work to the client is certainly
a good strategy as well.
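A sketch of how a package tool might consume such a static info file. The file name `info.json`, its JSON layout and the helper names are made up for this example; only the "simple" root URL comes from this thread.

```python
import json
from urllib.parse import quote

SIMPLE_ROOT = "http://pypi.python.org/simple/"  # root URL as used in this thread
INFO_FILENAME = "info.json"                     # hypothetical file name


def info_url(project, root=SIMPLE_ROOT):
    """Return the URL of the (hypothetical) static info file for a project."""
    return "%s%s/%s" % (root, quote(project), INFO_FILENAME)


def parse_info(raw_bytes):
    """Parse the info file; the `name`/`requires` keys are an assumed layout."""
    data = json.loads(raw_bytes.decode("utf-8"))
    return {"name": data["name"], "requires": list(data.get("requires", []))}
```

Because the file is plain static data, it can be served from the same directory tree, cached by a CDN and fetched with an ordinary GET.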

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com





From ametaireau at gmail.com  Mon Jun 14 17:00:58 2010
From: ametaireau at gmail.com (=?UTF-8?Q?Alexis_M=C3=A9taireau?=)
Date: Mon, 14 Jun 2010 17:00:58 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C1638E1.102@egenix.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com> 
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com> 
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com> 
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com> 
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com> 
	<4C15F5F3.40501@egenix.com>
	<AANLkTileCAZSsaWOkYrdAtOi6TAgZkUow3TYmBdmT9nO@mail.gmail.com> 
	<AANLkTikD97sRortV81717A01p4hDvmh8MbNfYZP83HxY@mail.gmail.com> 
	<4C1638E1.102@egenix.com>
Message-ID: <AANLkTimknAMOb8YhGLfdrZ-YYIsIIKlPeoTpK-aQ0YmE@mail.gmail.com>

On Mon, Jun 14, 2010 at 4:12 PM, M.-A. Lemburg <mal at egenix.com> wrote:

> When designing such interfaces, please consider that the PyPI information
> is mostly static. If there's information missing, it should be easy to add
> it to e.g. a new info file placed into the package's "simple" directory
> that package tools could pick up in REST style.
>

Yes, that can solve some of the problems pointed out here, and I'll consider
it. *But* it's not a solution to all problems, and RPC calls will be of great
help in some cases, as it could take a very long time to fetch all the
metadata, process it on the client side and return information for e.g. a
search on fields other than the name (give me all distributions from this
author).

But definitely, yes, I'll consider that.

Thanks !
-- 
Alexis

From mal at egenix.com  Mon Jun 14 17:06:50 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 14 Jun 2010 17:06:50 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTimknAMOb8YhGLfdrZ-YYIsIIKlPeoTpK-aQ0YmE@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C15F5F3.40501@egenix.com>	<AANLkTileCAZSsaWOkYrdAtOi6TAgZkUow3TYmBdmT9nO@mail.gmail.com>
	<AANLkTikD97sRortV81717A01p4hDvmh8MbNfYZP83HxY@mail.gmail.com>
	<4C1638E1.102@egenix.com>
	<AANLkTimknAMOb8YhGLfdrZ-YYIsIIKlPeoTpK-aQ0YmE@mail.gmail.com>
Message-ID: <4C16458A.4070001@egenix.com>

Alexis Métaireau wrote:
> On Mon, Jun 14, 2010 at 4:12 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> 
>> When designing such interfaces, please consider that the PyPI information
>> is mostly static. If there's information missing, it should be easy to add
>> it to e.g. a new info file placed into the package's "simple" directory
>> that package tools could pick up in REST style.
>>
> 
> Yes, that can solve some of the problems pointed out here, and I'll consider
> it. *But* it's not a solution to all problems, and RPC calls will be of great
> help in some cases, as it could take a very long time to fetch all the
> metadata, process it on the client side and return information for e.g. a
> search on fields other than the name (give me all distributions from this
> author).

Agreed, that's why I think it would be useful to simply put
all meta data into a SQLite database file and ship that as
static file as well. Local clients could then download the
database file (probably only a few MB) and work on it locally.

This would also make searches in PyPI a lot faster... not only
because searches could be done locally, but also because the
server wouldn't have to handle the load of those searches from
hundreds of clients.
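
As a rough illustration of the idea (not something PyPI provides today), a
client-side search over such a downloaded file might look like the sketch
below. The table name `releases` and its columns are assumptions made up for
this example, and an in-memory database stands in for the downloaded file.

```python
import sqlite3


def make_demo_db():
    """Build a stand-in for the downloaded snapshot (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE releases (name TEXT, version TEXT, author TEXT)")
    conn.executemany(
        "INSERT INTO releases VALUES (?, ?, ?)",
        [
            ("foo", "1.0", "Alice Example"),
            ("foo", "1.1", "Alice Example"),
            ("bar", "0.3", "Alice Example"),
            ("baz", "2.0", "Bob Example"),
        ],
    )
    return conn


def distributions_by_author(conn, author):
    """Return the sorted, de-duplicated distribution names for one author."""
    rows = conn.execute(
        "SELECT DISTINCT name FROM releases WHERE author = ? ORDER BY name",
        (author,),
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(distributions_by_author(make_demo_db(), "Alice Example"))
```

The query runs entirely on the client, so the "give me all distributions
from this author" case costs the server one static file download.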

> But definitely, yes, I'll consider that.

Thanks !

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 14 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                34 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From marrakis at gmail.com  Mon Jun 14 17:14:11 2010
From: marrakis at gmail.com (Mathieu Leduc-Hamel)
Date: Mon, 14 Jun 2010 17:14:11 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C16458A.4070001@egenix.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C15F5F3.40501@egenix.com>
	<AANLkTileCAZSsaWOkYrdAtOi6TAgZkUow3TYmBdmT9nO@mail.gmail.com>
	<AANLkTikD97sRortV81717A01p4hDvmh8MbNfYZP83HxY@mail.gmail.com>
	<4C1638E1.102@egenix.com>
	<AANLkTimknAMOb8YhGLfdrZ-YYIsIIKlPeoTpK-aQ0YmE@mail.gmail.com>
	<4C16458A.4070001@egenix.com>
Message-ID: <AANLkTim_upBVJeQVGP2Mas71szWJ2eQH24KmqUoO8Clu@mail.gmail.com>

> Agreed, that's why I think it would be useful to simply put
> all meta data into a SQLite database file and ship that as
> static file as well. Local clients could then download the
> database file (probably only a few MB) and work on it locally.

I don't think it would be easy to do that right now, since the database
stores more information than just the package metadata; you don't want
to give out all the information about user accounts, for example...

And I don't understand how this could work with constantly updated
information like that on the PyPI website: the database is always being
updated, so a downloaded copy can never be completely up to date...

Search might not be a big deal if the data and a good cache interface
are implemented; the number of parallel connections is something else...

> This would also make searches in PyPI a lot faster... not only
> because searches could be done locally, but also because the
> server wouldn't have to handle the load of those searches from
> hundreds of clients.
>
> > But definitely, yes, I'll consider that.
>
> Thanks !

From mal at egenix.com  Mon Jun 14 17:22:48 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 14 Jun 2010 17:22:48 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTim_upBVJeQVGP2Mas71szWJ2eQH24KmqUoO8Clu@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>
	<4C12A54D.1070406@egenix.com>	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>	<4C14D8E8.4010903@egenix.com>	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>	<4C15F5F3.40501@egenix.com>	<AANLkTileCAZSsaWOkYrdAtOi6TAgZkUow3TYmBdmT9nO@mail.gmail.com>	<AANLkTikD97sRortV81717A01p4hDvmh8MbNfYZP83HxY@mail.gmail.com>	<4C1638E1.102@egenix.com>	<AANLkTimknAMOb8YhGLfdrZ-YYIsIIKlPeoTpK-aQ0YmE@mail.gmail.com>	<4C16458A.4070001@egenix.com>
	<AANLkTim_upBVJeQVGP2Mas71szWJ2eQH24KmqUoO8Clu@mail.gmail.com>
Message-ID: <4C164948.6040602@egenix.com>

Mathieu Leduc-Hamel wrote:
>>
>> Agreed, that's why I think it would be useful to simply put
>> all meta data into a SQLite database file and ship that as
>> static file as well. Local clients could then download the
>> database file (probably only a few MB) and work on it locally.
>>
> I don't think it would be easy to do that right now, since the database
> stores more information than just the package metadata; you don't want
> to give out all the information about user accounts, for example...

PyPI uses PostgreSQL as database backend, so the SQLite database
file would be a (partial) copy of that database. Of course, it
would have to only contain meta-data that is also visible via
the web GUI.

> And I don't understand how this could work with constantly updated
> information like that on the PyPI website: the database is always being
> updated, so a downloaded copy can never be completely up to date...

True, but I think only very few users are really after real-time data
from PyPI. Those can use the true RPC interfaces.

For the others, a static copy created and updated every 10-20 minutes
or so, is likely good enough.
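
A minimal sketch of how such a periodic export could be written, assuming
the public rows have already been read from the main database (here they are
plain dictionaries). The column list, the `releases` table name and the
function name are hypothetical. Only whitelisted columns are copied, and the
new file replaces the old one in a single rename so clients never download a
half-written database.

```python
import os
import sqlite3
import tempfile

# Hypothetical whitelist: only fields that are also visible via the web GUI.
PUBLIC_COLUMNS = ("name", "version", "summary", "author")


def write_snapshot(rows, target_path):
    """Write the public columns of `rows` to a new SQLite file at target_path."""
    directory = os.path.dirname(os.path.abspath(target_path))
    fd, tmp_path = tempfile.mkstemp(suffix=".sqlite", dir=directory)
    os.close(fd)
    try:
        conn = sqlite3.connect(tmp_path)
        conn.execute("CREATE TABLE releases (%s)" % ", ".join(PUBLIC_COLUMNS))
        placeholders = ", ".join("?" for _ in PUBLIC_COLUMNS)
        conn.executemany(
            "INSERT INTO releases VALUES (%s)" % placeholders,
            ([row[col] for col in PUBLIC_COLUMNS] for row in rows),
        )
        conn.commit()
        conn.close()
        # Swap the finished file into place in one step.
        os.replace(tmp_path, target_path)
    except Exception:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Run from a cronjob every 10-20 minutes, this would keep the static copy
reasonably fresh; any keys not in the whitelist (account data, for example)
are simply never written.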

Anyway, it's an idea based on the 80/20 rule :-)

> Search might not be a big deal if the data and a good cache interface
> are implemented; the number of parallel connections is something else...
>
>> This would also make searches in PyPI a lot faster... not only
>> because searches could be done locally, but also because the
>> server wouldn't have to handle the load of those searches from
>> hundreds of clients.
>>
>>> But definitely, yes, I'll consider that.
>>
>> Thanks !

-- 
Marc-Andre Lemburg
eGenix.com


From chris at simplistix.co.uk  Mon Jun 14 17:27:50 2010
From: chris at simplistix.co.uk (Chris Withers)
Date: Mon, 14 Jun 2010 16:27:50 +0100
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTikt0U8YJU5eAFKysW1L14zs42LT_R1gadXjNutq@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de>	<AANLkTilvHoSVcVpExgbsOe0bqOYir4A3LZoIo794_tSV@mail.gmail.com>
	<4C12B862.9000503@v.loewis.de>	<AANLkTilC2kSUQr1Du6fdtCYaLUVmLsYbXFcgcc7Z9poP@mail.gmail.com>
	<4C12C07F.6000706@v.loewis.de>	<AANLkTils7kGRLgxdwZrtBVcqRdcJIbvcaG0ITivpVCtT@mail.gmail.com>
	<4C1338A6.1020601@v.loewis.de>	<AANLkTincczw85Mp9eOvBue3da2qGRRcRdxip4EOtJzvg@mail.gmail.com>
	<4C13450A.9050104@v.loewis.de>	<AANLkTim0twWHBk4ScJDF-0ew-8SNtAjB1xLTeOWiJQBe@mail.gmail.com>
	<AANLkTikt0U8YJU5eAFKysW1L14zs42LT_R1gadXjNutq@mail.gmail.com>
Message-ID: <4C164A76.2050006@simplistix.co.uk>

Guido van Rossum wrote:
> On Sat, Jun 12, 2010 at 12:48 PM, Justin Ryan
> <justin.ryan at reliefgarden.org> wrote:
>> Thanks, Martin, for taking the conversation offline to be a real jerk. ;)
> 
> (I won't quote more. Everyone who read it is still reeling from the
> sudden outburst.)

Sadly, it appears some people never change:

https://mail.zope.org/pipermail/zope-web/2006-October/004226.html

https://mail.zope.org/pipermail/zope-web/2006-October/date.html

cheers,

Chris

-- 
Simplistix - Content Management, Batch Processing & Python Consulting
             - http://www.simplistix.co.uk

From chris at simplistix.co.uk  Mon Jun 14 19:15:24 2010
From: chris at simplistix.co.uk (Chris Withers)
Date: Mon, 14 Jun 2010 18:15:24 +0100
Subject: [Catalog-sig] [Fwd: Re:  PyPI down again...]
Message-ID: <4C1663AC.6090708@simplistix.co.uk>

Apologies for forwarding this mail onto the list, please do not reply, I 
would just like this and the following message archived publicly so 
people can avoid this individual...

-------- Original Message --------
Subject: Re: [Catalog-sig] PyPI down again...
Date: Mon, 14 Jun 2010 09:19:45 -0700
From: Justin Ryan <justin.ryan at reliefgarden.org>
To: Chris Withers <chris at simplistix.co.uk>
CC: Guido van Rossum <guido at python.org>


Passion is better than the rampant apathy you show.  Frankly, Chris, I
believe that you may represent the sort of half-cocked volunteer that
turned Martin off to my offers of volunteering.

And, Chris, I started by standing up for you, asking why a long
standing member of this list showing concern over the constant system
failure was not being answered, was being ignored.

I'm not interested in being a part of any group or organization not
trying to GET SHIT DONE.

And that clearly includes the PSF.

If it wasn't clear to the list, you guys can make it so, that was a
farewell message from someone offering lots and lots of free time.

Anyway, you guys are all going to the killfile, so don't bother responding.

Also, grow a fucking sense of humor.  Boy was that one of my most
polite fuck-yous ever.

On Mon, Jun 14, 2010 at 8:27 AM, Chris Withers <chris at simplistix.co.uk> 
wrote:
> Guido van Rossum wrote:
>>
>> On Sat, Jun 12, 2010 at 12:48 PM, Justin Ryan
>> <justin.ryan at reliefgarden.org> wrote:
>>>
>>> Thanks, Martin, for taking the conversation offline to be a real jerk. ;)
>>
>> (I won't quote more. Everyone who read it is still reeling from the
>> sudden outburst.)
>
> Sadly, it appears some people never change:
>
> https://mail.zope.org/pipermail/zope-web/2006-October/004226.html
>
> https://mail.zope.org/pipermail/zope-web/2006-October/date.html
>
> cheers,
>
> Chris
>
> --
> Simplistix - Content Management, Batch Processing & Python Consulting
>            - http://www.simplistix.co.uk
>


-- 
Simplistix - Content Management, Batch Processing & Python Consulting
             - http://www.simplistix.co.uk

From chris at simplistix.co.uk  Mon Jun 14 19:15:51 2010
From: chris at simplistix.co.uk (Chris Withers)
Date: Mon, 14 Jun 2010 18:15:51 +0100
Subject: [Catalog-sig] [Fwd: Re:  PyPI down again...]
Message-ID: <4C1663C7.3000406@simplistix.co.uk>

More, again, please do not reply...

-------- Original Message --------
Subject: Re: [Catalog-sig] PyPI down again...
Date: Mon, 14 Jun 2010 09:33:51 -0700
From: Justin Ryan <justin.ryan at reliefgarden.org>
To: Chris Withers <chris at simplistix.co.uk>
CC: Guido van Rossum <guido at python.org>

And, for what it's worth, what set me off is Martin's attitude that
"Something Else" should be fixed.

He's clearly pissy because everyone wants to rewrite the thing in
Djangass, which is Stupid(tm), and I'll grant him that, but in
engineering, when your system is fragile, it is not a strong
reflection on oneself to say:

   "Yes, but if you turned the doorknob oh so gently, it wouldn't fall off."

We don't fucking need mirrors, we fucking need to stop counting
downloads.  Apt doesn't work that way.  Yum doesn't.  Microsoft and
Apple almost definitely do.

What kind of ridiculous software distribution mechanism requires
postgres for read-only operations.

This design would not be acceptable at Google, Mr. Rossum, I know that
because I've interviewed with those fucking narcissists so many times
I now tell recruiters anyone but Google.  The characteristics of
scaling PyPI currently is like scaling AdSense.

Planning on charging per download soon?

Anyway, you guys have lost the only person with time to dedicate to
this apparently.  Go buy Martin a "World's Greatest Dad" T-Shirt and
remind him that he's important because he, periodically, for a few
minutes at a time, does things that you would never, ever bother
yourself with.



-- 
Simplistix - Content Management, Batch Processing & Python Consulting
             - http://www.simplistix.co.uk

From jannis at leidel.info  Mon Jun 14 19:58:38 2010
From: jannis at leidel.info (Jannis Leidel)
Date: Mon, 14 Jun 2010 19:58:38 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C12A2E4.2090305@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de>
Message-ID: <67072407-83C3-4654-B7C0-33AA3D310370@leidel.info>

Hi all,

Apologies for the late reply, I was traveling.

>> Is it possible it's time to designate a team?  I'm sure everyone
>> appreciates the hard work of a lone volunteer, but having been one
>> myself at times, the feeling that others may not do the job right is
>> often eclipsed by their availability to try.
> 
> Help is certainly appreciated. The type of help depends on the volunteer, of course. E.g. I wouldn't want to give root accounts to
> the first person that comes along and asks for them (except when the first person is Jannis Leidel, who (I believe) did the Apache restart
> today).

Yes, I restarted Apache after getting a failure report on IRC. I'll look into the reasons later today.

Jannis

From mal at egenix.com  Tue Jun 15 13:49:03 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Jun 2010 13:49:03 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
Message-ID: <4C1768AF.9040606@egenix.com>

As mentioned, I've been working on a proposal text for the cloud idea.
Here's a first draft. Please have a look and let me know whether I've
missed any important facts. Thanks.

I intend to post the proposal to the PSF board (of which I'm a member,
in case you didn't know) and to have it vote on the proposal at one
of the next board meetings.

"""
PSF-Proposal: 100
Title: Move PyPI static data to the cloud for better availability
Version: Draft 1
Last-Modified: 2010-06-15
Author: mal at lemburg.com (Marc-André Lemburg)
Discussions-To: catalog-sig at python.org
Status: Draft
Type: Informational
Created: 2010-06-14
Post-History:


Proposal: Move PyPI static data to the cloud for better availability
========================================================================

Motivation
----------

PyPI has in recent months seen several outages, with the index being
unavailable both to users of the web GUI and to package administration
tools such as easy_install from setuptools.

As more and more Python applications rely on tools such as
easy_install for direct installation, or zc.buildout to manage the
complete software configuration cycle, the PyPI infrastructure
receives more and more attention from the Python community.

In order to maintain its credibility as a software repository, to
support the many different projects relying on the PyPI infrastructure
and the many users who rely on the simplified installation process
enabled by PyPI, the PSF needs to take action and move the essential
parts of PyPI to a more robust infrastructure that provides:

 * scalability
 * 24/7 system administration management
 * geo-localized fast and reliable access


Current Situation
-----------------

PyPI is currently run from a single server hosted in The Netherlands
(ximinez.python.org).  This server is run by a very small team of
sysadmins.

PyPI itself has in recent months been mostly maintained by one
developer: Martin von Loewis.  Projects are underway to enhance PyPI
in various ways, including a proposal to add external mirroring (PEP
381), but these are all far from being finalized or implemented.


Usage
-----

PyPI provides four different mechanisms for accessing the stored
information:

 * a web GUI that is meant for use by humans
 * an RPC interface which is mostly used for uploading new
   content
 * a semi-static /simple package listing, used by setuptools
 * a static area /packages for package download files and
   documentation, used by both the web GUI and setuptools

The /simple package listing is a dump of all packages in PyPI using a
simple HTML page with links to sub-pages for each package. These
sub-pages provide links to download files and external references.

External tools like easy_install only use the /simple package
listing together with the hosted package download files.

While the /simple package listing is currently dynamically created
from the database in real-time, this is not really needed for normal
operation. A static copy created every 10-20 minutes would provide the
same level of service in much the same way.


Moving static data to a CDN
---------------------------

Under the proposal the static information stored in PyPI
(meta-information as well as package download files and documentation)
is moved to a content delivery network (CDN).

For this purpose, the /simple package listing is replaced with a
static copy that is recreated every 10-20 minutes using a cronjob on
the PyPI server.

At the same intervals, another script will scan the package and
documentation files under /packages for updates and upload any changes
to the CDN for neartime availability.

By using a CDN the PSF will gain:

 * high availability of the static PyPI content
 * management offloaded to the CDN
 * geo-localized downloads, i.e. the files are hosted
   on a nearby server
 * faster downloads
 * more reliability and scalability
 * a move away from a single-point-of-failure setup

Note that the proposal does not cover distribution of the dynamic
parts of PyPI. As a result uploads to PyPI may still fail if the PyPI
server goes down. However, these dynamic parts are currently not being
used by the existing package installation tools.


Choice of CDN: Amazon Cloudfront
--------------------------------

To keep the costs low for the PSF, Amazon Cloudfront appears to be
the best choice of CDN.

Cloudfront is supported by a set of Python libraries (e.g. Amazon S3
lib and boto); upload scripts are readily available and can easily be
customized.

 http://www.saltycrane.com/blog/2008/12/card-store-project-4-notes-using-amazons-cloudfront/

Other CDNs, such as Akamai, are either more expensive or require
custom integration.  Availability of Python-based tools is not always
given; in fact, accessing such information is difficult for most of
the proprietary CDNs.


Cloudfront: quality of service
------------------------------

Amazon Cloudfront uses S3 as the basis for the service. S3 has been
around for years and has a very stable uptime record:

 http://www.readwriteweb.com/archives/amazon_s3_exceeds_9999_percent_uptime.php

Cloudfront itself has been around since Nov 2008.

You can check their current online status using this panel:

 http://status.aws.amazon.com/

Apart from the gained availability and outsourced management, we'd
also get faster downloads in most parts of the world, due to the local
caching Cloudfront is applying. This caching can be used to further
increase the availability, since we can control the expiry time of
those local copies.

So in summary we are replacing a single point of failure with N redundant
servers (with N being the number of edge caching servers they use).


How Cloudfront works
--------------------

Cloudfront uses Amazon's S3 storage system which is based on
"buckets".  These can store any number of files in a directory-like
structure. The only limit is a 5GB per file limit - more than enough
for any PyPI package file.

Cloudfront provides a domain for each registered S3 bucket via a
"distribution" which is then made available through local cache
servers in various locations around the world. The management of which
server to use for an incoming request is transparently handled by
Amazon. Once uploaded to the S3 bucket, the files will be distributed
to the cache servers on demand and as necessary.

Each edge server maintains a cache of requested files and
refetches the files after an expiry time which can be defined when
uploading the file to the bucket.

To simplify things on our side, we'll set up a CNAME DNS alias
for the Cloudfront domain issued by Amazon to our bucket:

 pypi-static.python.org. IN CNAME d32z1yuk7jeryy.cloudfront.net.

For more details, please see the Cloudfront documentation:

 http://aws.amazon.com/documentation/cloudfront/


Integration
-----------

In order to keep the number of changes to existing client side tools
and PyPI itself to a minimum, the installation will try to be as
transparent to both the server and the client side as possible.

This requires on the server side:

 * few, if any changes to the PyPI code base
 * simple scripts, driven by cronjobs
 * a simple distributed redirection setup to avoid having
   to change client side tools

On the client side:

 * no need to change the existing URL http://pypi.python.org/simple
   to access PyPI
 * redirects are already supported by setuptools via urllib2


Server side: upload cronjobs
----------------------------

Since the /simple index tree is currently being created dynamically,
we'd need to create static copies of it at regular intervals in order
to upload the content to the S3 bucket. This can easily be done using
tools such as wget or curl.

Both the static copy of the /simple tree and the static files uploaded
to /packages then need to be uploaded or updated in the S3 bucket by a
cronjob running every 10-20 minutes.
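
As a rough sketch of the "upload or update" step, the cronjob could compare
content hashes against a manifest saved by the previous run and hand only
the changed files to the uploader. The function names and the manifest
format (a dict of relative path to SHA-256 digest) are assumptions for
illustration; the actual upload call (e.g. via boto) is deliberately left
out.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(root, previous_manifest):
    """Return (paths_to_upload, current_manifest) for the tree under root.

    previous_manifest maps relative paths to digests from the last run;
    new or modified files end up in paths_to_upload.
    """
    current = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            relative = os.path.relpath(full, root).replace(os.sep, "/")
            current[relative] = file_digest(full)
    to_upload = sorted(
        path for path, digest in current.items()
        if previous_manifest.get(path) != digest
    )
    return to_upload, current
```

On the first run the previous manifest is empty, so every file is uploaded;
later runs only touch what changed in the last 10-20 minutes.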


Server side: download statistics
--------------------------------

The next step would then be to configure access logs:

 http://docs.amazonwebservices.com/AmazonCloudFront/latest/DeveloperGuide/index.html?AccessLogs.html

and add a cronjob to download them to the PyPI server.

Since the format is a bit different than the Apache log format used by
the PyPI software, we'd have two options:

 1. convert the Cloudfront format to Apache format and simply
    append the converted logs to the local log files

 2. write a Cloudfront log file reader and add it to the
    apache_count_dist.py script that updates the download
    counts on the web GUI

Both options require no more than a few hours to implement and test.
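
A sketch of option 1, under stated assumptions: the Cloudfront log is taken
to be tab-separated with a `#Fields:` header line naming columns such as
`date`, `time`, `c-ip`, `cs-method`, `cs-uri-stem`, `sc-status` and
`sc-bytes`, and the user agent is taken to be URL-encoded. Both should be
checked against the Cloudfront documentation linked above. The
`HTTP/1.1` in the request line is a placeholder, since the assumed columns
do not carry the protocol version.

```python
from urllib.parse import unquote

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def cloudfront_to_apache(lines):
    """Convert Cloudfront access log lines to Apache combined-format lines."""
    fields = None
    converted = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # Column names come from the log itself, not from this script.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or fields is None:
            continue
        record = dict(zip(fields, line.split("\t")))
        year, month, day = record["date"].split("-")
        stamp = "%s/%s/%s:%s +0000" % (
            day, MONTHS[int(month) - 1], year, record["time"])
        uri = record["cs-uri-stem"]
        query = record.get("cs-uri-query", "-")
        if query not in ("", "-"):
            uri += "?" + query
        converted.append('%s - - [%s] "%s %s HTTP/1.1" %s %s "%s" "%s"' % (
            record["c-ip"], stamp, record["cs-method"], uri,
            record["sc-status"], record["sc-bytes"],
            record.get("cs(Referer)", "-"),
            unquote(record.get("cs(User-Agent)", "-")),
        ))
    return converted
```

The converted lines could then be appended to the local log files that the
existing download-count script already reads.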


Server side: redirection setup
------------------------------

Since PyPI wasn't designed to be put on a CDN, it mixes static file
URL paths with dynamic access ones, e.g.

dynamic:

 http://pypi.python.org/pypi
 (and a few others)

static:

 http://pypi.python.org/simple
 http://pypi.python.org/packages

To move part of the URL path tree to a CDN, which works based on
domains, we will need to provide a URL redirection setup that
redirects client side tools to the new location.

As Martin von Loewis mentioned, this will require distributing the
redirection setup to more than just one server as well.

Fortunately, this is not difficult to do: it requires a preconfigured
lighttpd (*) setup running on N different servers which then all
provide the necessary redirections (and nothing more):

dynamic:

 http://pypi.python.org/ -> http://ximinez.python.org/pypi
 http://pypi.python.org/pypi -> http://ximinez.python.org/pypi
 (and possibly a few others)

static:

 http://pypi.python.org/simple -> http://pypi-static.python.org/simple
 http://pypi.python.org/packages -> http://pypi-static.python.org/packages
 http://pypi.python.org/documentation -> http://pypi-static.python.org/documentation
 (note: pypi-static.python.org is a CNAME alias for the Cloudfront
  domain issued to the S3 bucket where we upload the data)
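
The redirection table above can be sketched as a small path-mapping
function. This is only an illustration of the rules, not lighttpd
configuration; sending every other path unchanged to the dynamic server is
an assumption covering the "possibly a few others" case.

```python
STATIC_HOST = "http://pypi-static.python.org"
DYNAMIC_HOST = "http://ximinez.python.org"
STATIC_PREFIXES = ("/simple", "/packages", "/documentation")


def redirect_target(path):
    """Return the URL a request for `path` on pypi.python.org is sent to."""
    for prefix in STATIC_PREFIXES:
        # Match the prefix itself or anything below it, but not /simplex.
        if path == prefix or path.startswith(prefix + "/"):
            return STATIC_HOST + path
    if path in ("", "/"):
        return DYNAMIC_HOST + "/pypi"
    # Assumption: all remaining paths are dynamic and keep their path.
    return DYNAMIC_HOST + path


if __name__ == "__main__":
    for example in ("/", "/pypi", "/simple/foo/", "/packages/source/f/foo.tgz"):
        print(example, "->", redirect_target(example))
```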

The pypi.python.org domain would then have to be set up to map to
multiple IP addresses via DNS round-robin, one entry for each
redirection server, e.g.

 pypi.python.org. IN A 123.123.123.1
 pypi.python.org. IN A 123.123.123.2
 pypi.python.org. IN A 123.123.123.3
 pypi.python.org. IN A 123.123.123.4

Redirection servers could be run on all PSF server machines, and, to
increase availability, on PSF partner servers as well.

(*) lighttpd is a lightweight and fast HTTP server. It's easy to
set up, doesn't require a lot of resources on the server machine and
runs stably.


Long-term changes
-----------------

While enabling the above redirection setup, we should also start
working on changing PyPI and the client tools to use two new domains
which then cleanly separate the static CDN file access from the
dynamic PyPI server access:

 pypi.python.org
 pypi-static.python.org

Such a transition on the client side is expected to take at least a
few years. After that, the redirection service can be shut down or
used to distribute and scale the dynamic PyPI service parts.


Side-effects
------------

Restarts of the PyPI server, network outages, or hardware failures
would not affect the static copies of the PyPI on the CDN. setuptools,
easy_install, pip, zc.buildout, etc. would continue to work.

The S3 bucket would serve as additional backup for the files on PyPI.

Later integration with Amazon EC2 (their virtual server offering)
would easily be possible for more scalability and reduced system
administration load.


Costs
-----

Amazon charges for S3 and Cloudfront storage, transfer and access. The
costs vary depending on location.

 http://aws.amazon.com/cloudfront/#pricing
 http://aws.amazon.com/s3/#pricing

To get an idea of the costs, we'd have to take a closer look at
the PyPI web stats:

 http://pypi.python.org/webstats/usage_201005.html

In May 2010, PyPI transferred 819GB of data and had to handle 22 million
requests.

Using the AWS monthly calculator this gives roughly (I used 37KB as
average object size and 35% US, 35% EU, 10% HK, 10% JP as basis): USD
132 per month, or about USD 1,600 per year.
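
The figures above can be cross-checked with a few lines of Python. This
only redoes the arithmetic on the numbers quoted in this section; it does
not reproduce the AWS calculator, and treating 1 GB as 10^9 bytes is an
assumption.

```python
GB = 10**9  # assumption: decimal gigabytes

TRANSFERRED_BYTES = 819 * GB   # May 2010 transfer volume
REQUESTS = 22_000_000          # May 2010 request count
MONTHLY_COST_USD = 132         # figure quoted from the AWS calculator


def average_object_kb(total_bytes: int, requests: int) -> float:
    """Average transfer size per request in (decimal) kilobytes."""
    return total_bytes / requests / 1000


def yearly_cost(monthly_usd: int) -> int:
    """Twelve months at the given monthly cost."""
    return monthly_usd * 12


if __name__ == "__main__":
    print(f"average object size: {average_object_kb(TRANSFERRED_BYTES, REQUESTS):.1f} KB")
    print(f"yearly cost: USD {yearly_cost(MONTHLY_COST_USD)}")
```

This gives about 37.2 KB per request, which matches the 37KB used as
average object size, and USD 1,584 per year, which rounds to the USD 1,600
mentioned above.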

Refinancing the costs
---------------------

Since PyPI is being used as an essential resource by many important
Python projects (Zope, Plone, Django, etc.), it's fair to ask the
respective foundations and the general Python community for donations
to help refinance the administration costs.

A prominent donation button should go on the PyPI page with a text
explaining how PyPI is being hosted and why donations are necessary.

We may also be able to directly ask for donations from the above
foundations. Details of this are currently being evaluated by the PSF
board (there are some issues related to our non-profit status that
make this more complicated than it appears at first).


Effort
------

Given that most of the tools are readily available, setting up the
servers shouldn't take more than 2-3 developer days for developers
who've worked with Amazon S3 and Cloudfront before, including testing.

It is expected that we'll find volunteers to implement the necessary
changes.

"""

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 15 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                33 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From mal at egenix.com  Tue Jun 15 14:02:28 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Jun 2010 14:02:28 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>
	<4C12A54D.1070406@egenix.com>	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>	<4C14D8E8.4010903@egenix.com>	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>	<4C15F5F3.40501@egenix.com>
	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>
Message-ID: <4C176BD4.3080909@egenix.com>

Mathieu Leduc-Hamel wrote:
> To continue the discussion about a rewrite or a cleanup of the PyPI
> codebase: I'm from the Montreal-Python user group, and I'd say that yes, at
> first the current PyPI codebase seems very unclear and difficult to
> maintain.
> 
> But it's not an impossible mission and we are currently in the process of:
> 
> - Adding functional tests. The test coverage is now around 40%.
> - Once we reach more complete coverage, we want to replace the psycopg
> API with SQLAlchemy
> - Replacing much of the manual manipulation of the metadata with a more robust
> and straightforward approach (distutils2 might be the option there)
> 
> At first I was thinking about rewriting everything using the chishop project
> (an implementation of PyPI using Django). But keeping control of the source
> code and not depending on any framework is maybe a better idea.
> 
> Moreover, despite the frequent outages, PyPI is working today, so just a
> modernization of the code base seems to be the best idea.
> 
> By the way, after a code review by Tarek, a very useful thing might be to
> find a better way to handle and integrate contributions coming from the
> community. Right now Tarek is responsible for making the link between our
> effort and the work of Martin, but we don't have any official public mirror
> of the source code or any roadmap.

You should be able to get access to the Python sandbox repository and
add your project there:

http://svn.python.org/projects/sandbox/trunk/

If that's not an option, I'd suggest you have a look at one of the
other public repo sites such as launchpad.

Note that working on PyPI needs a somewhat different development
approach since any changes will be run on a live system.

In my experience the best way to do this is by gradually changing things
(rather than introduce big structural changes such as using SA
instead of a native adapter) and keeping a close eye on the log
files for any problems.

-- 
Marc-Andre Lemburg
eGenix.com


From mcrute at gmail.com  Tue Jun 15 14:20:43 2010
From: mcrute at gmail.com (Michael Crute)
Date: Tue, 15 Jun 2010 08:20:43 -0400
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C1768AF.9040606@egenix.com>
References: <4C1768AF.9040606@egenix.com>
Message-ID: <AANLkTikp7gTnJAXxCrXbwxf0C4Lz2h4ka8Udlw0r4tgO@mail.gmail.com>

On Tue, Jun 15, 2010 at 7:49 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> As mentioned, I've been working on a proposal text for the cloud idea.
> Here's a first draft. Please have a look and let me know whether I've
> missed any important facts. Thanks.

What about a set of volunteer mirrors of PyPi similar to the way CPAN
and Linux distributions handle this problem. pypi.python.org? That
approach eliminates any cost for the PSF and might ultimately result
in better reliability. With the volunteer mirror system you would
still statically generate the files and just make them available for
rsync, then set up a page to allow mirrors to register (see CPAN). If
you take this approach I would be happy to donate a mirror to the
pool.

-- 
Michael E. Crute
http://mike.crute.org

It is a mistake to think you can solve any major problem just with
potatoes. --Douglas Adams

From marrakis at gmail.com  Tue Jun 15 14:27:13 2010
From: marrakis at gmail.com (Mathieu Leduc-Hamel)
Date: Tue, 15 Jun 2010 14:27:13 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C176BD4.3080909@egenix.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C15F5F3.40501@egenix.com>
	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>
	<4C176BD4.3080909@egenix.com>
Message-ID: <AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>

Hi Martin,


> You should be able to get access to the Python sandbox repository and
> add your project there:
>
> http://svn.python.org/projects/sandbox/trunk/
>
> If that's not an option, I'd suggest you have a look at one of the
> other public repo sites such as launchpad.
>

Right now I'm working with Tarek Ziade on a clone of the PyPI source code
repository on Bitbucket. That way, Tarek can keep an eye on the
modifications I make to the source code, since double-checking any changes is
very important, as you said, for this type of project.


>
> Note that working on PyPI needs a somewhat different development
> approach since any changes will be run on a live system.
>
> In my experience the best way to do this is by gradually changing things
> (rather than introduce big structural changes such as using SA
> instead of a native adapter) and keeping a close eye on the log
> files for any problems.
>
>
That's why I have been working on better unit test coverage. I
would like to modernize the PyPI source code a little, because I think
there will be some major structural changes to the code in the future. Having
good test coverage will allow us to change the code and be less afraid of
making mistakes.

Switching to SA is only one of the many goals I would like to achieve,
but I think the structural change you were proposing might also need some
major changes to the code base if we want to do it properly.

Maybe it would be easier to switch to the official Mercurial repository (
hg.python.org); it would allow better collaboration between everybody who
would like to contribute.

And if you want to see the changes I'm proposing, you can find them at:

http://bitbucket.org/mtlpython/pypi

(it will be merged into Tarek's repo soon)






From ben+python at benfinney.id.au  Tue Jun 15 14:44:36 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 15 Jun 2010 22:44:36 +1000
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
References: <4C1768AF.9040606@egenix.com>
	<AANLkTikp7gTnJAXxCrXbwxf0C4Lz2h4ka8Udlw0r4tgO@mail.gmail.com>
Message-ID: <871vc81n4b.fsf@benfinney.id.au>

Michael Crute <mcrute at gmail.com> writes:

> On Tue, Jun 15, 2010 at 7:49 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> > As mentioned, I've been working on a proposal text for the cloud
> > idea. Here's a first draft. Please have a look and let me know
> > whether I've missed any important facts. Thanks.

If "the cloud" in this proposal means "some single organisation or
individual", I don't think the situation is thereby improved much.

> What about a set of volunteer mirrors of PyPi similar to the way CPAN
> and Linux distributions handle this problem. pypi.python.org? That
> approach eliminates any cost for the PSF and might ultimately result
> in better reliability.

+1.

A distributed system of mirrors administrated by disparate organisations
and/or individuals also greatly reduces the reliance on any individual
or organisation, helping reduce the inherent risks of both conflict of
interest and single-point-of-failure.

-- 
 \          "Rightful liberty is unobstructed action, according to our |
  `\        will, within limits drawn around us by the equal rights of |
_o__)                                      others." --Thomas Jefferson |
Ben Finney


From fuzzyman at voidspace.org.uk  Tue Jun 15 15:18:27 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Tue, 15 Jun 2010 14:18:27 +0100
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <AANLkTikp7gTnJAXxCrXbwxf0C4Lz2h4ka8Udlw0r4tgO@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTikp7gTnJAXxCrXbwxf0C4Lz2h4ka8Udlw0r4tgO@mail.gmail.com>
Message-ID: <AANLkTinVBjigkfBSowzhSgWPvc1AAskseSSKZSntaviA@mail.gmail.com>

On 15 June 2010 13:20, Michael Crute <mcrute at gmail.com> wrote:

> On Tue, Jun 15, 2010 at 7:49 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> > As mentioned, I've been working on a proposal text for the cloud idea.
> > Here's a first draft. Please have a look and let me know whether I've
> > missed any important facts. Thanks.
>
> What about a set of volunteer mirrors of PyPi similar to the way CPAN
> and Linux distributions handle this problem. pypi.python.org? That
> approach eliminates any cost for the PSF and might ultimately result
> in better reliability. With the volunteer mirror system you would
> still statically generate the files and just make them available for
> rsync, then set up a page to allow mirrors to register (see CPAN). If
> you take this approach I would be happy to donate a mirror to the
> pool.
>
>

>From the document:

"Projects are underway to enhance PyPI
in various ways, including a proposal to add external mirroring (PEP
381), but these are all far from being finalized or implemented."

Just saying "mirroring" is not a solution in itself - that also takes time
and effort.

Michael






-- 
http://www.voidspace.org.uk

From ametaireau at gmail.com  Tue Jun 15 15:48:23 2010
From: ametaireau at gmail.com (Alexis Métaireau)
Date: Tue, 15 Jun 2010 15:48:23 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C1768AF.9040606@egenix.com>
References: <4C1768AF.9040606@egenix.com>
Message-ID: <AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>

Hello,

Firstly, as Tarek said in another thread, I'm afraid this will kill PEP 381,
which is about building a mirroring infrastructure.
Having an infrastructure hosted on a cloud platform may be comfortable, and
is probably needed to have a system running 24/7, but
we need to make sure it remains possible to create new public mirrors
outside of the Amazon (or whatever) cloud infrastructure.

On Tue, Jun 15, 2010 at 1:49 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>
> PyPI is currently run from a single server hosted in The Netherlands
> (ximinez.python.org).  This server is run by a very small team of sys
> admin.
>

As Martin von Löwis said, this already exists: "a.mirrors.pypi.python.org
 and b.mirrors.pypi.python.org are already there and could be used by
clients". Martin, could you explain to us (apologies if this has already been
done somewhere) how things work right now? Is it possible to rely on the
existing work rather than using a cloud system? What infrastructure is
already in place?

Alexis

From steve at pearwood.info  Tue Jun 15 16:33:45 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 16 Jun 2010 00:33:45 +1000
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C1768AF.9040606@egenix.com>
References: <4C1768AF.9040606@egenix.com>
Message-ID: <201006160033.46095.steve@pearwood.info>

On Tue, 15 Jun 2010 09:49:03 pm M.-A. Lemburg wrote:
> As mentioned, I've been working on a proposal text for the cloud
> idea. Here's a first draft. Please have a look and let me know
> whether I've missed any important facts. Thanks.

I think the most important missed fact is, just how unreliable is PyPI 
currently? Does anyone know?

I know there's a number of people complaining that it's down "all the 
time", or even occasionally, but I think that we need to know the 
magnitude of the problem that needs solving. What's the average length 
of time between outages? What's the average length of the outage? Just 
saying that there's been several outages in recent months is awfully 
hand-wavy.


[...]
> Amazon Cloudfront uses S3 as basis for the service, S3 has been
> around for years and has a very stable uptime:
>
> http://www.readwriteweb.com/archives/amazon_s3_exceeds_9999_percent_uptime.php

Is there anyone here who has personal experience with Cloudfront and is 
willing to vouch for it? Or argue against it? We can only go so far 
based on Amazon's marketing material.


One thing that does worry me:

> So in summary we are replacing a single point of failure with N
> points of failure (with N being the number of edge caching servers
> they use).

I don't think this means what you seem to think it means. If you replace 
a single point of failure with N points of failure, your overall 
reliability goes down, not up, since there are now more things to go 
wrong. Assuming that they're independent points of failure, that means 
your total number of failures will increase by a factor of N.

For example, if a single edge server in (say) Australia goes down, 
Amazon might not count it as an outage for the purpose of calculating 
their 99.99% reliability since the system as a whole is still up, but 
conceivably Australian users might see an outage (or at least a 
slow-down). With N servers, I'd expect N times the number of individual 
outages, with Amazon presumably only counting it as "system down" if 
all N servers go down at the same time.
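
To put rough numbers on the difference between "some server is down" and
"all servers are down", here is a small Python sketch. It assumes
independent servers that are each unavailable with the same probability q.
The values q = 0.01 and N = 8 in the usage lines are purely illustrative,
not measurements.

```python
def p_any_down(q: float, n: int) -> float:
    """Probability that at least one of n independent servers is down."""
    return 1.0 - (1.0 - q) ** n


def p_all_down(q: float, n: int) -> float:
    """Probability that all n independent servers are down at once."""
    return q ** n


def expected_servers_down(q: float, n: int) -> float:
    """Expected number of servers that are down at a random moment."""
    return n * q


if __name__ == "__main__":
    q, n = 0.01, 8  # illustrative values only
    print("at least one down:", p_any_down(q, n))
    print("all down:         ", p_all_down(q, n))
    print("expected down:    ", expected_servers_down(q, n))
```

With these illustrative values, individual failures do become roughly N
times more frequent (about 7.7% of the time at least one server is down),
while the chance of every server being down together becomes tiny. Which
of the two a user experiences depends on whether requests can fail over
to the remaining servers.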




-- 
Steven D'Aprano

From marrakis at gmail.com  Tue Jun 15 16:42:53 2010
From: marrakis at gmail.com (Mathieu Leduc-Hamel)
Date: Tue, 15 Jun 2010 16:42:53 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <201006160033.46095.steve@pearwood.info>
References: <4C1768AF.9040606@egenix.com>
	<201006160033.46095.steve@pearwood.info>
Message-ID: <AANLkTikZVzAgxPD1F3Tmo6hqLNZWYrqzC5COQhkI9H81@mail.gmail.com>

>
> I think the most important missed fact is, just how unreliable is PyPI
> currently? Does anyone know?
>

Exactly my point: right now, since the code is not completely clear and not
tested, we don't really know what's supposed to work and how.

It's really a problem when the only way to know that something has gone
wrong is that your users start complaining...


> I don't think this means what you seem to think it means. If you replace
> a single point of failure with N points of failure, your overall
> reliability goes down, not up, since there are now more things to go
> wrong. Assuming that they're independent points of failure, that means
> your total number of failures will increase by a factor of N.
>
>
This is why we should work on the heart of the problem: PyPI itself and
why it's down sometimes.

Nobody knows exactly what happens; maybe it's not a performance problem.

As you said, we may have the same problem in the future on all mirroring
nodes ...

From mal at egenix.com  Tue Jun 15 17:55:30 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Jun 2010 17:55:30 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <201006160033.46095.steve@pearwood.info>
References: <4C1768AF.9040606@egenix.com>
	<201006160033.46095.steve@pearwood.info>
Message-ID: <4C17A272.9070808@egenix.com>

Steven D'Aprano wrote:
> On Tue, 15 Jun 2010 09:49:03 pm M.-A. Lemburg wrote:
>> As mentioned, I've been working on a proposal text for the cloud
>> idea. Here's a first draft. Please have a look and let me know
>> whether I've missed any important facts. Thanks.
> 
> I think the most important missed fact is, just how unreliable is PyPI 
> currently? Does anyone know?
> 
> I know there's a number of people complaining that it's down "all the 
> time", or even occasionally, but I think that we need to know the 
> magnitude of the problem that needs solving. What's the average length 
> of time between outages? What's the average length of the outage? Just 
> saying that there's been several outages in recent months is awfully 
> hand-wavy.

I'm sorry, but I can't provide any numbers since there doesn't
appear to be any monitoring in place to pull those numbers from.

What I can say is that from reading the various mailing lists,
PyPI is down often enough to let people start discussions about
it and that's the point I want to address:

"""
In order to maintain its credibility as software repository, to
support the many different projects relying on the PyPI infrastructure
and the many users who rely on the simplified installation process
enabled by PyPI, the PSF needs to take action and move the essential
parts of PyPI to a more robust infrastructure that provides:

 * scalability
 * 24/7 system administration management
 * geo-localized fast and reliable access
"""

Setting up some Zenoss or Nagios monitoring system to take
care of monitoring the PyPI server (and our other servers)
would be a separate project.
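
Until such monitoring exists, even a simple probe log would answer the
questions about outage frequency and length. The following Python sketch
shows how the two averages could be derived from a list of
(timestamp, is_up) samples. The sample format and the function names are
assumptions for illustration; this is not how Zenoss or Nagios store
their data.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Sample = Tuple[datetime, bool]        # (probe time, server answered?)
Outage = Tuple[datetime, datetime]    # (first failed probe, first good probe)


def outage_intervals(samples: List[Sample]) -> List[Outage]:
    """Collapse chronologically ordered samples into outage intervals.

    An outage still open at the end is closed at the last sample's time.
    """
    outages: List[Outage] = []
    down_since: Optional[datetime] = None
    for ts, is_up in samples:
        if not is_up and down_since is None:
            down_since = ts
        elif is_up and down_since is not None:
            outages.append((down_since, ts))
            down_since = None
    if down_since is not None:
        outages.append((down_since, samples[-1][0]))
    return outages


def mean_outage_length(outages: List[Outage]) -> timedelta:
    """Average duration of the given outages (zero if there are none)."""
    if not outages:
        return timedelta(0)
    total = sum((end - start for start, end in outages), timedelta(0))
    return total / len(outages)


def mean_time_between_outages(outages: List[Outage]) -> Optional[timedelta]:
    """Average gap from the end of one outage to the start of the next."""
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(outages, outages[1:])]
    if not gaps:
        return None
    return sum(gaps, timedelta(0)) / len(gaps)
```

The resolution of the results is limited by the probe interval, so a
five-minute probe cannot see outages shorter than five minutes.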

> [...]
>> Amazon Cloudfront uses S3 as basis for the service, S3 has been
>> around for years and has a very stable uptime:
>>
>> http://www.readwriteweb.com/archives/amazon_s3_exceeds_9999_percent_uptime.php
> 
> Is there anyone here who has personal experience with Cloudfront and is 
> willing to vouch for it? Or argue against it? We can only go so far 
> based on Amazon's marketing material.

I don't have personal experience with Cloudfront, but
have advised companies to use Amazon EC2 and S3 as a disaster
recovery and backup solution. So far, none of them has
ever complained.

While doing research for the proposal, I've read a lot
of posts about people using Amazon S3 and Cloudfront. The
overall feedback is very positive.

If things still don't work out for us, we can always go back
to the single server setup. The proposal doesn't bind us
to Cloudfront or the CDN setup in any way.

> One thing that does worry me:
> 
>> So in summary we are replacing a single point of failure with N
>> points of failure (with N being the number of edge caching servers
>> they use).
> 
> I don't think this means what you seem to think it means. If you replace
> a single point of failure with N points of failure, your overall 
> reliability goes down, not up, since there are now more things to go 
> wrong. Assuming that they're independent points of failure, that means 
> your total number of failures will increase by a factor of N.
> 
> For example, if a single edge server in (say) Australia goes down, 
> Amazon might not count it as an outage for the purpose of calculating 
> their 99.99% reliability since the system as a whole is still up, but 
> conceivably Australian users might see an outage (or at least a 
> slow-down). With N servers, I'd expect N times the number of individual 
> outages, with Amazon presumably only counting it as "system down" if 
> all N servers go down at the same time.

It's poor wording, I agree. Thanks for pointing this out.
The math is correct, though, I believe...

Let's say all servers have a probability of being
unavailable of P("Server down") = q (with q in [0,1]).

Let's further assume that all servers are independent of
each other.

The probability of none of the servers being available then is
P("System down") = q^N <= q

Cloudfront uses a DNS round-robin system with a TTL of 60 seconds,
and returns more than just one cache server per edge node, e.g.
in Germany I get 8 cache servers:

> dig d1ylr6sba64qi3.cloudfront.net

;; ANSWER SECTION:
d1ylr6sba64qi3.cloudfront.net. 57 IN    CNAME   d1ylr6sba64qi3.ams1.cloudfront.net.
d1ylr6sba64qi3.ams1.cloudfront.net. 57 IN A     216.137.59.184
d1ylr6sba64qi3.ams1.cloudfront.net. 57 IN A     216.137.59.250
d1ylr6sba64qi3.ams1.cloudfront.net. 57 IN A     216.137.59.84
d1ylr6sba64qi3.ams1.cloudfront.net. 57 IN A     216.137.59.106
d1ylr6sba64qi3.ams1.cloudfront.net. 57 IN A     216.137.59.15
d1ylr6sba64qi3.ams1.cloudfront.net. 57 IN A     216.137.59.102
d1ylr6sba64qi3.ams1.cloudfront.net. 57 IN A     216.137.59.40
d1ylr6sba64qi3.ams1.cloudfront.net. 57 IN A     216.137.59.118

;; AUTHORITY SECTION:
ams1.cloudfront.net.    141251  IN      NS      ns-ams1-01.cloudfront.net.
ams1.cloudfront.net.    141251  IN      NS      ns-ams1-02.cloudfront.net.

The probability of all 8 servers being down is
P("Edge node down") = q^8 <= q

Assuming that Amazon's system monitoring is fast enough to detect
the edge node down state, it will likely switch me over to a
different edge within those 60 seconds, where I'll see another
8 or so servers:

P("2 edge nodes unavailable") = q^8 * q^8 = q^16

and so on.

Now compare all this to the probability of the single
PyPI server being down:

P("PyPI server down") = q >> q^N = P("Cloudfront down")

In other words, the probability for PyPI on the CDN being
unreachable for more than say 5 minutes (assuming the switchover
to all edge nodes takes at most 5 minutes), is q^N.

In numbers:

Let's assume that q=0.01, ie. 99% uptime, with N=32 (the true
number is likely higher):

P("PyPI server down") = 0.01 >> P("Cloudfront down") = 0.01^32 = 1e-64

Of course, you'd have to add an offset for the Amazon infrastructure
or network connectivity being down, human error, inherent system
failures and DDoS attacks, so the actual numbers are higher.
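
One hedged way to fold such an offset into the numbers is to add a
common-mode term c for failures that take out the whole CDN at once. The
following Python sketch does that and converts the result into expected
hours of downtime per year. The formula c + (1 - c) * q^N and the value
c = 1e-4 (loosely taken from the 99.99% figure quoted in this thread) are
assumptions for illustration, not Amazon data.

```python
HOURS_PER_YEAR = 365 * 24


def p_cdn_down(q: float, n: int, common_mode: float) -> float:
    """Probability that the CDN is unreachable.

    Either a common-mode failure hits everything (probability
    `common_mode`), or all n independent servers are down at once.
    """
    return common_mode + (1.0 - common_mode) * q ** n


def downtime_hours_per_year(p_down: float) -> float:
    """Expected hours of unavailability per year for a given probability."""
    return p_down * HOURS_PER_YEAR


if __name__ == "__main__":
    q, n, c = 0.01, 32, 1e-4  # c is an assumed common-mode probability
    print("single server:", downtime_hours_per_year(q), "hours/year")
    print("CDN:          ", downtime_hours_per_year(p_cdn_down(q, n, c)), "hours/year")
```

Under these assumptions the common-mode term dominates completely: the
q^N part is negligible, and the comparison comes down to q for the single
server against c for the CDN.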

-- 
Marc-Andre Lemburg
eGenix.com


From mal at egenix.com  Tue Jun 15 18:02:33 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Jun 2010 18:02:33 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
Message-ID: <4C17A419.4060602@egenix.com>

Alexis Métaireau wrote:
> Hello,
> 
> Firstly, as Tarek said in another thread, I'm afraid this will kill PEP 381,
> which is about building a mirroring infrastructure.
> Having an infrastructure hosted on a cloud platform may be comfortable, and
> is probably needed to have a system running 24/7, but
> we need to make sure it remains possible to create new public mirrors
> outside of the Amazon (or whatever) cloud infrastructure.

The proposal doesn't prevent that. However, please note that
setting up public mirrors not under PSF control has its own
set of (legal) problems, which the PSF hosted cloud setup avoids.

> On Tue, Jun 15, 2010 at 1:49 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>>
>> PyPI is currently run from a single server hosted in The Netherlands
>> (ximinez.python.org).  This server is run by a very small team of sys
>> admin.
>>
> 
> As Martin von Löwis said, this already exists: "a.mirrors.pypi.python.org
>  and b.mirrors.pypi.python.org are already there and could be used by
> clients". Martin, could you explain to us (apologies if this has already been
> done somewhere) how things work right now? Is it possible to rely on the
> existing work rather than using a cloud system? What infrastructure is
> already in place?

In order to use those two servers, you'd still need to implement
the redirection changes or client side tool changes and, what's
more important, you'd need to administer and monitor those servers
24/7 to achieve similar uptime.

The latter is what the proposal is all about: we're outsourcing
the administration and monitoring to a service provider.

-- 
Marc-Andre Lemburg
eGenix.com


From mal at egenix.com  Tue Jun 15 18:10:31 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Jun 2010 18:10:31 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTikp7gTnJAXxCrXbwxf0C4Lz2h4ka8Udlw0r4tgO@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTikp7gTnJAXxCrXbwxf0C4Lz2h4ka8Udlw0r4tgO@mail.gmail.com>
Message-ID: <4C17A5F7.7080808@egenix.com>

Michael Crute wrote:
> On Tue, Jun 15, 2010 at 7:49 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>> As mentioned, I've been working on a proposal text for the cloud idea.
>> Here's a first draft. Please have a look and let me know whether I've
>> missed any important facts. Thanks.
> 
> What about a set of volunteer mirrors of PyPi similar to the way CPAN
> and Linux distributions handle this problem. pypi.python.org? That
> approach eliminates any cost for the PSF and might ultimately result
> in better reliability. With the volunteer mirror system you would
> still statically generate the files and just make them available for
> rsync, then set up a page to allow mirrors to register (see CPAN). If
> you take this approach I would be happy to donate a mirror to the
> pool.

Thanks for the offer.

Setting up such a network based on PSF partner organizations (to avoid
the legal problems) would indeed work, but it would both take
longer to set up and require more work on the administration side.

I still think that the cloud proposal is more cost effective
and faster to set up.

If it doesn't work out, we can always go back to such a network
of servers that we administer on our own.

-- 
Marc-Andre Lemburg
eGenix.com


From ziade.tarek at gmail.com  Tue Jun 15 19:02:05 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 15 Jun 2010 19:02:05 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C17A419.4060602@egenix.com>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
	<4C17A419.4060602@egenix.com>
Message-ID: <AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>

On Tue, Jun 15, 2010 at 6:02 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> Alexis Métaireau wrote:
>> Hello,
>>
>> Firstly, as Tarek said in another thread, I'm afraid this kill the PEP381
>> about making a mirroring infrastructure.
>> Having a infrastructure hosted on a cloud platform may be confortable, and
>> probably needed to have a 24/7 running system, but
>> we need to take care of letting possible the creation of new public mirrors,
>> outside from the Amazon (or whatever) cloud infrastructure.
>
> The proposal doesn't prevent that. However, please note that
> setting up public mirrors not under PSF control has its own
> set of (legal) problems, which the PSF hosted cloud setup avoids.

Mirrors already exist out there, so unless you ban them (which would
be a really bad idea), setting up a cloud will not fix any legal issue,
if you think there is one.

In any case, you can't prevent people from creating mirrors even if you
said it's illegal. Moreover, having mirrors provided by the community
is way better than relying on one single entity (the PSF) for this
(if we think "decentralized").

So I think it would be better to focus on PEP 381, and make those
existing mirrors comply with it. And maybe work on the legal issues
you've mentioned.


>
>> On Tue, Jun 15, 2010 at 1:49 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>>>
>>> PyPI is currently run from a single server hosted in The Netherlands
>>> (ximinez.python.org). This server is run by a very small team of sys
>>> admin.
>>>
>>
>> As Martin von Löwis said, this already exists. "a.mirrors.pypi.python.org
>> and b.mirrors.pypi.python.org are already there and could be used by
>> clients". Maybe Martin can you explain us (apologies if this is already done
>> somewhere) how things are working from now ? Is this possible to rely on the
>> existing work rather than using a cloud system ? What's the in place
>> infrastructure ?
>
> In order to use those two servers, you'd still need to implement
> the redirection changes or client side tool changes and, what's
> more important, you'd need to administer and monitor those servers
> 24/7 to achieve similar uptime.

Not at all, because the registered mirrors would be in the DNS round robin,
and the clients would just have to switch to another mirror if a mirror
is down (that's explained in PEP 381).
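
A minimal sketch of the client-side failover described above. The
function names, the explicit list of mirror hosts and the injectable
`fetch` callable are assumptions made for illustration; they are not
taken from PEP 381, and a real client would get its mirror list from DNS:

```python
import urllib.request
from typing import Callable, Iterable, Optional


def http_fetch(url: str, timeout: float = 10.0) -> bytes:
    """Default fetcher: plain HTTP GET with a timeout."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_failover(path: str,
                        mirrors: Iterable[str],
                        fetch: Callable[[str], bytes] = http_fetch) -> bytes:
    """Return the first successful response, trying the mirrors in order."""
    last_error: Optional[Exception] = None
    for host in mirrors:
        url = "http://%s/%s" % (host, path.lstrip("/"))
        try:
            return fetch(url)
        except OSError as exc:  # URLError, timeouts and DNS failures are OSErrors
            last_error = exc
    # Every mirror failed (or the list was empty).
    raise RuntimeError("no mirror could serve %r" % path) from last_error
```

Passing the fetcher in as a parameter keeps the retry logic testable
without network access; it says nothing about whether the mirror that
answers is up to date.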

Such a decentralized system is far more reliable than any centralized
system, and won't cost the PSF anything.


>
> The latter is what the proposal is all about: we're outsourcing
> the administration and monitoring to a service provider.

Having a better PyPI server is of course a good idea, don't get me wrong.

But it doesn't really solve anything at this point.

A simple, documented protocol, and a list of registered mirrors
backed by the community is the way to go imho.

And that's what has already happened unofficially! When PyPI is
down, you'll see tweets saying "go to this url, it's my mirror!"

So I would trust the community and finish the PEP and provide a
library that would allow anyone to run a PEP 381-compatible mirror.

Regards
Tarek

-- 
Tarek Ziadé | http://ziade.org

From ziade.tarek at gmail.com  Tue Jun 15 19:09:30 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 15 Jun 2010 19:09:30 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C15F5F3.40501@egenix.com>
	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>
	<4C176BD4.3080909@egenix.com>
	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>
Message-ID: <AANLkTimruSoM8lIiIRfyZZdVSN0n4gjJqjj6-09KzFlC@mail.gmail.com>

On Tue, Jun 15, 2010 at 2:27 PM, Mathieu Leduc-Hamel <marrakis at gmail.com> wrote:
[..]
> Maybe it would be easier to switch to the official mercurial repository
> (hg.python.org), it would allow a better collaboration between everybody who
> would like to contribute.

Yes that's what I was proposing earlier in the thread.

Having the repo at hg.python.org would facilitate contributions. We can have a
process where they are reviewed by Martin and/or myself for example,
and pulled from anyone's clone.

I am volunteering to import it into hg.python.org, if Martin agrees
to this switch.

Regards
Tarek
-- 
Tarek Ziadé | http://ziade.org

From ronaldoussoren at mac.com  Tue Jun 15 19:15:00 2010
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Tue, 15 Jun 2010 19:15:00 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
	<4C17A419.4060602@egenix.com>
	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>
Message-ID: <A447F7AF-18D4-4E07-8F9B-BDA0E6BC92D2@mac.com>


On 15 Jun, 2010, at 19:02, Tarek Ziadé wrote:

> On Tue, Jun 15, 2010 at 6:02 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>> Alexis Métaireau wrote:
>>> Hello,
>>> 
>>> Firstly, as Tarek said in another thread, I'm afraid this kill the PEP381
>>> about making a mirroring infrastructure.
>>> Having a infrastructure hosted on a cloud platform may be confortable, and
>>> probably needed to have a 24/7 running system, but
>>> we need to take care of letting possible the creation of new public mirrors,
>>> outside from the Amazon (or whatever) cloud infrastructure.
>> 
>> The proposal doesn't prevent that. However, please note that
>> setting up public mirrors not under PSF control has its own
>> set of (legal) problems, which the PSF hosted cloud setup avoids.
> 
> Mirrors already exists out there, so unless you ban them (which would
> be a really bad idea)
> setting up a cloud will not fix any legal issue if you think there's a
> legal issue.
> 
> In any case, you can't prevent people from creating mirrors even if you
> would say its illegal. Moreover, having mirrors provided by the community
> is way better than relying on one single entity (the PSF) for this.
> (if we think "decentralized")

Why is having community mirrors better than one managed by the PSF?

Even with community mirrors, the contents of PyPI are still controlled by the PSF, because they control the master server. There is not much decentralization in that respect.

AFAIK the goal of this exercise is to improve the uptime of the PyPI download service as used by existing installations. MAL's proposal seems like an easy way to accomplish that with minimal effort.

Ronald

From jcea at jcea.es  Tue Jun 15 19:22:22 2010
From: jcea at jcea.es (Jesus Cea)
Date: Tue, 15 Jun 2010 19:22:22 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C1768AF.9040606@egenix.com>
References: <4C1768AF.9040606@egenix.com>
Message-ID: <4C17B6CE.20209@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 15/06/10 13:49, M.-A. Lemburg wrote:
> Server side: upload cronjobs
> ----------------------------
> 
> Since the /simple index tree is currently being created dynamically,
> we'd need to create static copies of it at regular intervals in order
> to upload the content to the S3 bucket. This can easily be done using
> tools such as wget or curl.
> 
> Both the static copy of the /simple tree and the static files uploaded
> to /packages then need to be uploaded or updated in the S3 bucket by a
> cronjob running every 10-20 minutes.

I won't comment on whether migrating is a good idea or not.

But having to wait 20 minutes to deploy my just-released package to my
datacenter is a bit inconvenient for me :-).

It would be nice to change the PyPI code to dump "simple" each time the
database changes. Judging by the RSS feed, the load should be low, and
actually less demanding on the CPU and database server (if you only update
"simple" with the changes rather than rebuilding everything each time).
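
A rough sketch of what such an incremental dump could look like,
assuming the PyPI code can hand over the names of the projects that
changed together with their current file lists. The directory layout,
the link format and both function names are hypothetical, not how PyPI
actually generates /simple:

```python
import html
import os


def render_simple_page(filenames):
    """Render a bare-bones index page with one link per release file."""
    links = "\n".join(
        '<a href="../../packages/%s">%s</a><br/>'
        % (html.escape(name, quote=True), html.escape(name))
        for name in sorted(filenames))
    return "<html><body>\n%s\n</body></html>\n" % links


def update_simple(root, changed):
    """Rewrite only the pages of changed projects.

    `changed` maps a project name to its current list of file names.
    Pages of projects that are not in `changed` are left untouched.
    """
    written = []
    for project, filenames in changed.items():
        project_dir = os.path.join(root, "simple", project)
        os.makedirs(project_dir, exist_ok=True)
        path = os.path.join(project_dir, "index.html")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(render_simple_page(filenames))
        written.append(path)
    return written
```

The work done per release is then proportional to the number of changed
projects, not to the size of the whole index.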

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTBe2zplgi5GaxT1NAQLZIAP+JHe5dAVN27FTMD+gMzKntFEbEA3t9gqh
gblEFPc5bigEAvfXxJTm2p+A0meeH7dVNT2akyYU4Cn+DmdV9+LkXY1c+beV7bpY
BD2ROBvmFJ05FXPPkFD/La4Z0Bqb9JuZy7PV2kTQagzMsn3VjLJRDWt5K0kpIwcw
Fntro0K/dRs=
=G2bd
-----END PGP SIGNATURE-----

From ziade.tarek at gmail.com  Tue Jun 15 19:24:29 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 15 Jun 2010 19:24:29 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <A447F7AF-18D4-4E07-8F9B-BDA0E6BC92D2@mac.com>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
	<4C17A419.4060602@egenix.com>
	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>
	<A447F7AF-18D4-4E07-8F9B-BDA0E6BC92D2@mac.com>
Message-ID: <AANLkTilm3RLO5Z5uJzCtjYUqmACLOl-jWucgQv13IicH@mail.gmail.com>

On Tue, Jun 15, 2010 at 7:15 PM, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
>
> On 15 Jun, 2010, at 19:02, Tarek Ziadé wrote:
>
>> On Tue, Jun 15, 2010 at 6:02 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>>> Alexis Métaireau wrote:
>>>> Hello,
>>>>
>>>> Firstly, as Tarek said in another thread, I'm afraid this kill the PEP381
>>>> about making a mirroring infrastructure.
>>>> Having a infrastructure hosted on a cloud platform may be confortable, and
>>>> probably needed to have a 24/7 running system, but
>>>> we need to take care of letting possible the creation of new public mirrors,
>>>> outside from the Amazon (or whatever) cloud infrastructure.
>>>
>>> The proposal doesn't prevent that. However, please note that
>>> setting up public mirrors not under PSF control has its own
>>> set of (legal) problems, which the PSF hosted cloud setup avoids.
>>
>> Mirrors already exists out there, so unless you ban them (which would
>> be a really bad idea)
>> setting up a cloud will not fix any legal issue if you think there's a
>> legal issue.
>>
>> In any case, you can't prevent people from creating mirrors even if you
>> would say its illegal. Moreover, having mirrors provided by the community
>> is way better than relying on one single entity (the PSF) for this.
>> (if we think "decentralized")
>
> Why is having community mirrors better than one managed by the PSF?

Because it's no longer controlled by one single entity. For example,
if something is broken in the system and needs human intervention,
and the sysadmins are not available, we get downtime.

Lots of mirrors backed by more people in the community greatly reduce
this problem.

> Even with community mirrors the contents of PyPI are still controlled by the PSF, because they control the master server, there is not much decentralization in that respect.

Once the DNS is set to accept other servers, the PyPI 'main' server is
just the master that gets the content first which is then replicated.

So, yes, the PSF controls the DNS, but will not control the
downtime/uptime issues anymore.


> AFAIK the goal of this exercise is to improve the uptime of the PyPI download service as used by existing installation, MAL's proposal seems like an easy way to accomplish that with minimal effort.

Again, mirrors already exist out there, and they are getting updated
every day. We are not far from what we want. So after more thought, I
really don't think the cloud thing will be a minimal effort.


>
> Ronald



-- 
Tarek Ziadé | http://ziade.org

From mal at egenix.com  Tue Jun 15 19:34:42 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Jun 2010 19:34:42 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>	<4C17A419.4060602@egenix.com>
	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>
Message-ID: <4C17B9B2.10006@egenix.com>

Tarek Ziadé wrote:
> On Tue, Jun 15, 2010 at 6:02 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>> Alexis Métaireau wrote:
>>> Hello,
>>>
>>> Firstly, as Tarek said in another thread, I'm afraid this kill the PEP381
>>> about making a mirroring infrastructure.
>>> Having a infrastructure hosted on a cloud platform may be confortable, and
>>> probably needed to have a 24/7 running system, but
>>> we need to take care of letting possible the creation of new public mirrors,
>>> outside from the Amazon (or whatever) cloud infrastructure.
>>
>> The proposal doesn't prevent that. However, please note that
>> setting up public mirrors not under PSF control has its own
>> set of (legal) problems, which the PSF hosted cloud setup avoids.
> 
> Mirrors already exists out there, so unless you ban them (which would
> be a really bad idea)
> setting up a cloud will not fix any legal issue if you think there's a
> legal issue.
> 
> In any case, you can't prevent people from creating mirrors even if you
> would say its illegal. Moreover, having mirrors provided by the community
> is way better than relying on one single entity (the PSF) for this.
> (if we think "decentralized")
> 
> So I think it would be better to focus on PEP 381, and make those
> existing mirrors comply with it. And maybe work on the legal issues
> you've mentioned

That can all happen in parallel.

>>> On Tue, Jun 15, 2010 at 1:49 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>>>>
>>>> PyPI is currently run from a single server hosted in The Netherlands
>>>> (ximinez.python.org).  This server is run by a very small team of sys
>>>> admin.
>>>>
>>>
>>> As Martin von Löwis said, this already exists. "a.mirrors.pypi.python.org
>>>  and b.mirrors.pypi.python.org are already there and could be used by
>>> clients". Maybe Martin can you explain us (apologies if this is already done
>>> somewhere) how things are working from now ? Is this possible to rely on the
>>> existing work rather than using a cloud system ? What's the in place
>>> infrastructure ?
>>
>> In order to use those two servers, you'd still need to implement
>> the redirection changes or client side tool changes and, what's
>> more important, you'd need to administer and monitor those servers
>> 24/7 to achieve similar uptime.
> 
> Not at all because the registered mirrors would be in the DNS round robin,
> and the clients would just have to switch to another mirror if a mirror
> is down. (that's explained in PEP 381)

Someone would still have to provide the system administration for
those servers and also make sure that the servers do actually provide
up-to-date snapshots. DNS round-robin will help with finding the
servers, not with the other aspects.

Something the PEP should focus a bit more on is the freshness
guarantee of the mirror data. It currently puts this
important detail into the hands of the client software,
so every package tool will have to find its own way of
determining whether to use a mirror or not.
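
To illustrate the kind of check each client tool would have to
implement, here is a minimal sketch. It assumes the client has somehow
obtained the time of each mirror's last synchronisation; how it gets
that timestamp, the one-hour threshold and the function names are all
assumptions made for the example:

```python
from datetime import datetime, timedelta, timezone


def is_fresh(last_sync: datetime, now: datetime,
             max_age: timedelta = timedelta(hours=1)) -> bool:
    """True if the mirror synchronised within `max_age` before `now`."""
    return timedelta(0) <= now - last_sync <= max_age


def pick_fresh_mirrors(sync_times: dict, now: datetime,
                       max_age: timedelta = timedelta(hours=1)) -> list:
    """Return the names of fresh mirrors, most recently synced first."""
    fresh = [name for name, synced in sync_times.items()
             if is_fresh(synced, now, max_age)]
    return sorted(fresh, key=lambda name: sync_times[name], reverse=True)
```

A tool could call `pick_fresh_mirrors` with
`now=datetime.now(timezone.utc)` and fall back to the master server when
the result is empty.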

Another important feature missing from the PEP is data consistency.
Since a client tool would only communicate with one mirror, it
will ultimately have to trust the information on that server,
including the MD5 sums. This makes it rather easy to manipulate
data on the servers (not by the admins, but by hackers manipulating
those servers).

Having digitally signed packages, like you do on many Linux repository
servers, would solve this issue, but would also require a complete
verification infrastructure on the client side.
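
As a small illustration of the trust problem, a digest check only helps
when the expected value comes from somewhere other than the mirror that
served the file. The sketch below assumes the client obtained the MD5
sum from a source it trusts (for example the master server); the
function name is made up, and this is not a substitute for real
signatures:

```python
import hashlib
import hmac


def matches_trusted_md5(data: bytes, trusted_md5_hex: str) -> bool:
    """Compare the MD5 of downloaded data with a digest from a trusted source."""
    actual = hashlib.md5(data).hexdigest()
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(actual, trusted_md5_hex.strip().lower())
```

If both the file and the digest come from the same compromised mirror,
this check passes for manipulated data as well.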

You don't need any of this with the cloud caching approach.

> Such a decentralized system is far more reliable than any centralized
> system, and won't cost anything to the PSF.

We'll see :-)

>>
>> The latter is what the proposal is all about: we're outsourcing
>> the administration and monitoring to a service provider.
> 
> Having a better PyPI server is of course a good idea, don't get me wrong.
> 
> But it doesn't really solve anything at this point.

Obviously I have a different opinion, otherwise I wouldn't have
written the proposal :-)

-- 
Marc-Andre Lemburg
eGenix.com


From mal at egenix.com  Tue Jun 15 19:43:31 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Jun 2010 19:43:31 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTilm3RLO5Z5uJzCtjYUqmACLOl-jWucgQv13IicH@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>	<4C17A419.4060602@egenix.com>	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>	<A447F7AF-18D4-4E07-8F9B-BDA0E6BC92D2@mac.com>
	<AANLkTilm3RLO5Z5uJzCtjYUqmACLOl-jWucgQv13IicH@mail.gmail.com>
Message-ID: <4C17BBC3.3050205@egenix.com>

Tarek Ziadé wrote:
> On Tue, Jun 15, 2010 at 7:15 PM, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
>>
>> On 15 Jun, 2010, at 19:02, Tarek Ziadé wrote:
>>
>>> On Tue, Jun 15, 2010 at 6:02 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>>>> Alexis Métaireau wrote:
>>>>> Hello,
>>>>>
>>>>> Firstly, as Tarek said in another thread, I'm afraid this kill the PEP381
>>>>> about making a mirroring infrastructure.
>>>>> Having a infrastructure hosted on a cloud platform may be confortable, and
>>>>> probably needed to have a 24/7 running system, but
>>>>> we need to take care of letting possible the creation of new public mirrors,
>>>>> outside from the Amazon (or whatever) cloud infrastructure.
>>>>
>>>> The proposal doesn't prevent that. However, please note that
>>>> setting up public mirrors not under PSF control has its own
>>>> set of (legal) problems, which the PSF hosted cloud setup avoids.
>>>
>>> Mirrors already exists out there, so unless you ban them (which would
>>> be a really bad idea)
>>> setting up a cloud will not fix any legal issue if you think there's a
>>> legal issue.
>>>
>>> In any case, you can't prevent people from creating mirrors even if you
>>> would say its illegal. Moreover, having mirrors provided by the community
>>> is way better than relying on one single entity (the PSF) for this.
>>> (if we think "decentralized")
>>
>> Why is having community mirrors better than one managed by the PSF?
> 
> Because it's not controlled anymore by one single entity. For example,
> if something is broken in the system
> and need a human intervention, and the sysadmin people are not
> available, we get a downtime.

I'm not sure I understand: if the PyPI server goes down, the
data will still be readily available on Amazon S3 and Cloudfront
caches - the cronjobs copy over the PyPI server content to S3
and Cloudfront serves it up from there.

And if Cloudfront or S3 goes down, client tools could still
try to access the PyPI server. (I'll add a note about that to
the proposal.)

-- 
Marc-Andre Lemburg
eGenix.com


From jcea at jcea.es  Tue Jun 15 19:44:05 2010
From: jcea at jcea.es (Jesus Cea)
Date: Tue, 15 Jun 2010 19:44:05 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTikp7gTnJAXxCrXbwxf0C4Lz2h4ka8Udlw0r4tgO@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTikp7gTnJAXxCrXbwxf0C4Lz2h4ka8Udlw0r4tgO@mail.gmail.com>
Message-ID: <4C17BBE5.4010901@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 15/06/10 14:20, Michael Crute wrote:
> What about a set of volunteer mirrors of PyPi similar to the way CPAN
> and Linux distributions handle this problem. pypi.python.org? That
> approach eliminates any cost for the PSF and might ultimately result
> in better reliability. With the volunteer mirror system you would
> still statically generate the files and just make them available for
> rsync then setup a page to allow mirrors to register (see CPAN). If
> you take this approach I would be happy to donate a mirror to the
> pool.

I would actually prefer this approach, with the following changes to
the current code:

1. setuptools & friends: Support for retrying several mirrors if first
try fails.

2. Packages MUST be digitally signed. Ideally by the owner, but at least
by PYPI central node (current pypi server). That way, a "rogue" mirror
can't distribute trojans.

3. Trusting the stats is not possible :(, if there are "rogue" mirrors.

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTBe75Zlgi5GaxT1NAQLnawP+J4Cb6ywGCpIEOsD1L4mbUTfnWnh9X59T
zxTjxbEdCaZrbLgY2KuAAoAdSocmrQFhX/zfeMxEpoilnLH2mZknM+Bb6icNAzbR
JFYDmfu7QPhUjPrNgFlQhXQsuuMnpNEzTv3yINmjKZg2OYwU7BhbolFKrAGF+b+5
kKmnwWjTju0=
=rQh4
-----END PGP SIGNATURE-----

From mal at egenix.com  Tue Jun 15 19:45:28 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Jun 2010 19:45:28 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17B6CE.20209@jcea.es>
References: <4C1768AF.9040606@egenix.com> <4C17B6CE.20209@jcea.es>
Message-ID: <4C17BC38.6090208@egenix.com>

Jesus Cea wrote:
> On 15/06/10 13:49, M.-A. Lemburg wrote:
>> Server side: upload cronjobs
>> ----------------------------
> 
>> Since the /simple index tree is currently being created dynamically,
>> we'd need to create static copies of it at regular intervals in order
>> to upload the content to the S3 bucket. This can easily be done using
>> tools such as wget or curl.
> 
>> Both the static copy of the /simple tree and the static files uploaded
>> to /packages then need to be uploaded or updated in the S3 bucket by a
>> cronjob running every 10-20 minutes.
> 
> I don't comment about the convenience to migrate or not.
> 
> But having to wait 20 minutes to deploy my just released package to my
> datacenter is a bit inconvenient to me :-).
> 
> Would be nice to change PYPI code just to dump "simple" each time the
> database changes. Perusing the RSS, the load should be low and actually
> less demanding to CPU and database server (if you only update "simple"
> with the changes, not rebuilding everything each time).

I'll leave that for a version 2.0 of the cloud idea :-)

My main interest now is getting something done with only requiring
minimal changes to the PyPI software.

Note that with community servers that only mirror once a day,
you'd have to wait up to a whole day for your package updates
to become visible worldwide.

-- 
Marc-Andre Lemburg
eGenix.com


From jcea at jcea.es  Tue Jun 15 19:53:19 2010
From: jcea at jcea.es (Jesus Cea)
Date: Tue, 15 Jun 2010 19:53:19 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <201006160033.46095.steve@pearwood.info>
References: <4C1768AF.9040606@egenix.com>
	<201006160033.46095.steve@pearwood.info>
Message-ID: <4C17BE0F.5090509@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 15/06/10 16:33, Steven D'Aprano wrote:
> For example, if a single edge server in (say) Australia goes down, 
> Amazon might not count it as an outage for the purpose of calculating 
> their 99.99% reliability since the system as a whole is still up, but 
> conceivably Australian users might see an outage (or at least a 
> slow-down). With N servers, I'd expect N times the number of individual 
> outages, with Amazon presumably only counting it as "system down" if 
> all N servers go down at the same time.

I don't know, but if I were Amazon, I would (automatically) update the
DNS to serve Australian users from any other edge server :).
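
For what it's worth, the arithmetic in the quoted argument can be
sketched as follows. It assumes that edge servers fail independently and
all have the same availability, which real CDNs will not satisfy
exactly; the numbers and function names are only illustrative:

```python
def expected_individual_outages(n_servers: int,
                                outages_per_server: float) -> float:
    """Individual outages grow linearly with the number of servers."""
    return n_servers * outages_per_server


def probability_all_down(n_servers: int, availability: float) -> float:
    """Chance that every server is down at once, assuming independence."""
    return (1.0 - availability) ** n_servers
```

So more servers mean more individual outages somewhere, while the
chance of a "system down" shrinks quickly, which is why rerouting users
to another edge server matters.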

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTBe+D5lgi5GaxT1NAQJtfAP6Azk2UGRQS7tPpxX9AcHQA9ALRXubcoHQ
cleDsSxDe0ghoeSVtGMFJYN3KTlMknc9sPmxwBy2dR8tTlxQh0ytHQsQEqokZMsC
jAbtYcaPgVG4gPo19xHg81elTkRAVhflW7NbV8AmlEIPXsV1LP92DH5wHPMaWyws
4nynJKYCBlY=
=g4k1
-----END PGP SIGNATURE-----

From jcea at jcea.es  Tue Jun 15 20:21:41 2010
From: jcea at jcea.es (Jesus Cea)
Date: Tue, 15 Jun 2010 20:21:41 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17BC38.6090208@egenix.com>
References: <4C1768AF.9040606@egenix.com> <4C17B6CE.20209@jcea.es>
	<4C17BC38.6090208@egenix.com>
Message-ID: <4C17C4B5.3000801@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 15/06/10 19:45, M.-A. Lemburg wrote:
> Note that with community servers that only mirror once a day,
> you'd have to wait up to a whole day for your package updates
> to become visible worldwide.

But TODAY mirror use is voluntary and per-user. That is, you use a
mirror because you want to, not because PyPI is transparently pushing
you towards one. I haven't used mirrors so far, because PyPI instability
hasn't hit me yet, and because I don't "trust" mirrors (see next
paragraph).

I read PEP 381 a long time ago and I don't remember how/when a mirror
would update, but I do remember it doesn't mandate digital signatures
(signed by the PyPI central node, verified by setuptools & friends). That
is a big gap, in my opinion.

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTBfEtZlgi5GaxT1NAQKuKAP/YUTRh9GXAlEa8X5trvnUsWmS6KRgxSIz
jxB35L9WwWKR0FMzeay1ThvOoiz5aXlrqGaBbEZiPjr3UuWMXRf+WSh2RoylEher
f5i8pxwwBwopVCKbRx07nWsroJUH9oIFYmTY/IIidqjh8UNL+FBBRCSRuFyay/H/
W/zxzjAFxuc=
=UVuI
-----END PGP SIGNATURE-----

From ziade.tarek at gmail.com  Tue Jun 15 20:38:58 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 15 Jun 2010 20:38:58 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C17B9B2.10006@egenix.com>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
	<4C17A419.4060602@egenix.com>
	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>
	<4C17B9B2.10006@egenix.com>
Message-ID: <AANLkTik_MDSpgfgxFaEzomiDTWhcb2aBu3eUahRVf8V7@mail.gmail.com>

On Tue, Jun 15, 2010 at 7:34 PM, M.-A. Lemburg <mal at egenix.com> wrote:
[..]
>> So I think it would be better to focus on PEP 381, and make those
>> existing mirrors comply with it. And maybe work on the legal issues
>> you've mentioned
>
> That can all happen in parallel.

I really doubt it.

You have come up with a cloud proposal and want it to be funded by the PSF.

Your proposal is basically a proprietary mirroring system, and it competes
with the mirroring protocol we wanted to build, based on the existing
mirrors the community has.

So far I don't see any advantage in a cloud-based mirror managed by the PSF,
compared to a round of community mirrors.

Given the lack of time and resources we had to finish the work, this
means that if your proposal is accepted, it will be done, whereas
PEP 381 will stay as it is today.

So if you want this to happen in parallel, funding should also be granted
to build the implementation of PEP 381 (in z3c.pypimirror I guess).


[..]
>> Not at all because the registered mirrors would be in the DNS round robin,
>> and the clients would just have to switch to another mirror if a mirror
>> is down. (that's explained in PEP 381)
>
> Someone would still have to provide the system administration for
> those servers and also make sure that the servers do actually provide
> up-to-date snapshots. DNS round-robin will help with finding the
> servers, not with the other aspects.
>
> Something the PEP should focus a bit more on is the freshness
> guarantee of the mirror data. It currently puts this
> important detail into the hands of the client software,
> so every package tool will have to find its own way of
> determining whether to use a mirror or not.
>
> Another important feature missing from the PEP is data consistency.
> Since a client tool would only communicate with one mirror, it
> will ultimately have to trust the information on that server,
> including the MD5 sums. This makes it rather easy to manipulate
> data on the servers (not by the admins, but by hackers manipulating
> those servers).

Your PyPI cloud infrastructure could be hacked as well.

The mirrors are trusted because they are registered manually and
they are managed by people in the community we trust.

[..]
>> Such a decentralized system is far more reliable than any centralized
>> system, and won't cost anything to the PSF.
>
> We'll see :-)

Hehe, not sure what you mean here. Did the PSF vote yes on your
proposal already ?  ;)


>>>
>>> The latter is what the proposal is all about: we're outsourcing
>>> the administration and monitoring to a service provider.
>>
>> Having a better PyPI server is of course a good idea, don't get me wrong.
>>
>> But it doesn't really solve anything at this point.
>
> Obviously I have a different opinion, otherwise I wouldn't have
> written the proposal :-)

Well, technically the problem is already solved by the existing mirrors we have
in the community: when PyPI is down, other servers can take over.

I have no doubt you can enhance the PyPI main server
and bring its uptime close to 100% by putting in money and time.

But having a documented protocol and a library anyone who has a spare
server can use to provide a mirror will always beat your cloud system for the
reasons I've already mentioned.

Putting all the eggs in the same basket (PSF+Amazon?) can't be as reliable
as a distributed network of mirrors.


Regards
Tarek

-- 
Tarek Ziadé | http://ziade.org

From ziade.tarek at gmail.com  Tue Jun 15 20:52:14 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 15 Jun 2010 20:52:14 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C17C4B5.3000801@jcea.es>
References: <4C1768AF.9040606@egenix.com> <4C17B6CE.20209@jcea.es>
	<4C17BC38.6090208@egenix.com> <4C17C4B5.3000801@jcea.es>
Message-ID: <AANLkTik-1Z6vecE5b83fKNA-qo687EiQUQDq5CIM5Oo5@mail.gmail.com>

On Tue, Jun 15, 2010 at 8:21 PM, Jesus Cea <jcea at jcea.es> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 15/06/10 19:45, M.-A. Lemburg wrote:
>> Note that with community servers that only mirror once a day,
>> you'd have to wait up to a whole day for your package updates
>> to become visible worldwide.
>
> But TODAY mirror use is voluntary and per-user. That is, you use a
> mirror because you want, not because pypi is pushing you around
> transparently. I don't use mirrors so far, because pypi instability
> hasn't hit me so far, and because I don't "trust" mirrors (see next
> paragraph).
>
> I read pep 381 long time ago and I don't remember how/when a mirror
> would update, but I do remember it doesn't mandate digital signatures
> (signed by pypi central node, verified by setuptools&friends). That is a
> big gap, in my opinion.

You don't trust mirrors right now, but if they are listed at PyPI as
official mirrors, they are managed by people who can be trusted as much
as the PyPI sysadmin, for instance, and much, much more than the packages
you can download from PyPI.

Do you trust the package you are installing more than an "official"
mirror ? if so, why ?

Anyone can upload a package at PyPI with

   os.system('rm -rf /')

in its setup.py...

Regards
Tarek

-- 
Tarek Ziadé | http://ziade.org

From ziade.tarek at gmail.com  Tue Jun 15 20:47:52 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 15 Jun 2010 20:47:52 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C17BBC3.3050205@egenix.com>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
	<4C17A419.4060602@egenix.com>
	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>
	<A447F7AF-18D4-4E07-8F9B-BDA0E6BC92D2@mac.com>
	<AANLkTilm3RLO5Z5uJzCtjYUqmACLOl-jWucgQv13IicH@mail.gmail.com>
	<4C17BBC3.3050205@egenix.com>
Message-ID: <AANLkTikXWnAlk_MVppqgCX6WhouDvqC0w8fPUJRpR6HK@mail.gmail.com>

On Tue, Jun 15, 2010 at 7:43 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> Tarek Ziadé wrote:
>> On Tue, Jun 15, 2010 at 7:15 PM, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
>>>
>>> On 15 Jun, 2010, at 19:02, Tarek Ziadé wrote:
>>>
>>>> On Tue, Jun 15, 2010 at 6:02 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>>>>> Alexis Métaireau wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Firstly, as Tarek said in another thread, I'm afraid this would kill PEP 381
>>>>>> about making a mirroring infrastructure.
>>>>>> Having an infrastructure hosted on a cloud platform may be comfortable, and
>>>>>> probably needed to have a 24/7 running system, but
>>>>>> we need to make sure it remains possible to create new public mirrors
>>>>>> outside the Amazon (or whatever) cloud infrastructure.
>>>>>
>>>>> The proposal doesn't prevent that. However, please note that
>>>>> setting up public mirrors not under PSF control has its own
>>>>> set of (legal) problems, which the PSF hosted cloud setup avoids.
>>>>
>>>> Mirrors already exist out there, so unless you ban them (which would
>>>> be a really bad idea)
>>>> setting up a cloud will not fix any legal issue if you think there's a
>>>> legal issue.
>>>>
>>>> In any case, you can't prevent people from creating mirrors even if you
>>>> would say it's illegal. Moreover, having mirrors provided by the community
>>>> is way better than relying on one single entity (the PSF) for this.
>>>> (if we think "decentralized")
>>>
>>> Why is having community mirrors better than one managed by the PSF?
>>
>> Because it's not controlled anymore by one single entity. For example,
>> if something is broken in the system
>> and needs human intervention, and the sysadmin people are not
>> available, we get downtime.
>
> I'm not sure I understand: if the PyPI server goes down, the
> data will still be readily available on Amazon S3 and Cloudfront
> caches - the cronjobs copy over the PyPI server content to S3
> and Cloudfront serves it up from there.
>
> And if Cloudfront or S3 goes down, client tools could still
> try to access the PyPI server. (I'll add a note about that to
> the proposal.)

This can't beat a distributed network of mirrors that are not
depending on a single provider like Amazon.

We have suffered from this at bitbucket.org as a matter of fact:
Amazon was having problems, so bitbucket was slow and sometimes
down.

If Bitbucket had had a distributed network of mirrors hosted
at different providers back then, that wouldn't have happened.

What I have learned lately in this area is that a lot of cheap servers spread
all over the world in different datacenters is more reliable.

And we happen to have this network already: lots of people
will host a PyPI mirror as soon as it's easy to set one up, imho.

Regards
Tarek

-- 
Tarek Ziadé | http://ziade.org

From martin at v.loewis.de  Tue Jun 15 21:02:45 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Tue, 15 Jun 2010 21:02:45 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>
	<4C12A54D.1070406@egenix.com>	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>	<4C14D8E8.4010903@egenix.com>	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>	<4C15F5F3.40501@egenix.com>	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>	<4C176BD4.3080909@egenix.com>
	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>
Message-ID: <4C17CE55.5000601@v.loewis.de>

> Hi Martin,

Notice that you actually replied to Marc-Andre Lemburg.

>     You should be able to get access to the Python sandbox repository and
>     add your project there:
>
>     http://svn.python.org/projects/sandbox/trunk/
>
>     If that's not an option, I'd suggest you have a look at one of the
>     other public repo sites such as launchpad.
>
> Right now I'm working with Tarek Ziade on a clone of the PyPI repository
> source code on bitbucket; that way, Tarek can keep an eye on the
> modifications I make to the source code, since double-checking any
> changes is very important, as you said, for this type of project.

Most certainly. However, before I add the code to PyPI, I'd review it, 
anyway, so no worries.

Just be prepared to provide the code as separately-reviewable chunks
of modifications.

> That's why I was working to implement better unit test coverage. I
> would like to modernize the source code of PyPI a little bit because I
> think in the future there will be some major structural changes to the
> code. Having great test coverage will allow us to change the code and
> be less afraid of making mistakes.

Alternatively, you could start submitting patches.

> Maybe it would be easier to switch to the official mercurial repository
> (hg.python.org <http://hg.python.org>), it would allow a better
> collaboration between everybody who would like to contribute.

I'm not quite sure why that would be. You still couldn't write to the 
repository, could you? So what would be the difference?

Regards,
Martin

From martin at v.loewis.de  Tue Jun 15 21:46:06 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 15 Jun 2010 21:46:06 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C1768AF.9040606@egenix.com>
References: <4C1768AF.9040606@egenix.com>
Message-ID: <4C17D87E.2050609@v.loewis.de>

> PyPI itself has in recent months been mostly maintained by one
> developer: Martin von Loewis.  Projects are underway to enhance PyPI
> in various ways, including a proposal to add external mirroring (PEP
> 381), but these are all far from being finalized or implemented.

That's not at all accurate: PEP 381 is almost completely implemented
in the mirroring tools. Client-side support is missing, but isn't
strictly necessary as users could manually point their setuptools
installation to a mirror.

> While the /simple package listing is currently dynamically created
> from the database in real-time, this is not really needed for normal
> operation. A static copy created every 10-20 minutes would provide the
> same level of service in much the same way.

For normal operation (i.e. on the master copy), this would be really 
insufficient. Users expect, in automated build processes, that the 
packages they upload are available for *immediate* download.

> Under the proposal the static information stored in PyPI
> (meta-information as well as package download files and documentation)
> is moved to a content delivery network (CDN).

There is a good chance that, before that proposal is implemented,
the PEP 381 implementation is completed.

> At the same intervals, another script will scan the package and
> documentation files under /packages for updates and upload any changes
> to the CDN for neartime availability.

Not sure why you wouldn't push every change immediately to the CDN, though.

> Cloudfront itself has been around since Nov 2008.

Please add that Amazon considers Cloudfront as a beta service.

> The pypi.python.org domain would then have to be setup to map to
> multiple IP addresses via DNS round-robin, one entry for each
> redirection server, e.g.
>
>   pypi.python.org. IN A 123.123.123.1
>   pypi.python.org. IN A 123.123.123.2
>   pypi.python.org. IN A 123.123.123.3
>   pypi.python.org. IN A 123.123.123.4

I don't think this works if one of the servers fails (or, worse,
produces a hanging connection). What piece of software would implement 
the fallback to the next machine?
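
To make the question concrete, here is a minimal sketch of what such a
client-side fallback might look like. It is not code from setuptools or
distutils; the function names and the five-second timeout are assumptions
made for illustration only. It asks the resolver for every address behind
the name and tries them one at a time, so a dead or hanging server only
costs one timeout.

```python
import socket


def resolve_all(host, port, resolver=socket.getaddrinfo):
    """Return every distinct IP address the resolver reports for host."""
    seen = []
    for _family, _type, _proto, _canon, sockaddr in resolver(
            host, port, 0, socket.SOCK_STREAM):
        address = sockaddr[0]
        if address not in seen:
            seen.append(address)
    return seen


def connect_with_fallback(host, port, timeout=5.0,
                          resolver=socket.getaddrinfo,
                          connector=socket.create_connection):
    """Try each address in turn; return (address, connection) for the first one that answers."""
    errors = []
    for address in resolve_all(host, port, resolver):
        try:
            return address, connector((address, port), timeout)
        except OSError as exc:  # refused, unreachable, or timed out
            errors.append((address, exc))
    raise OSError("no server reachable for %s: %r" % (host, errors))
```

The resolver and connector are parameters only so the logic can be
exercised without touching the network; a real tool would use the defaults.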

Regards,
Martin

From martin at v.loewis.de  Tue Jun 15 21:48:38 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 15 Jun 2010 21:48:38 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17BBE5.4010901@jcea.es>
References: <4C1768AF.9040606@egenix.com>	<AANLkTikp7gTnJAXxCrXbwxf0C4Lz2h4ka8Udlw0r4tgO@mail.gmail.com>
	<4C17BBE5.4010901@jcea.es>
Message-ID: <4C17D916.8030502@v.loewis.de>

> 1. setuptools&  friends: Support for retrying several mirrors if first
> try fails.

That's the part that still needs to be implemented.

> 2. Packages MUST be digitally signed. Ideally by the owner, but at least
> by PYPI central node (current pypi server). That way, a "rogue" mirror
> can't distribute trojans.

That is already part of the mirroring infrastructure (although not yet 
explained in PEP 381).

> 3. Trusting the stats is not possible :(, if there are "rogue" mirrors.

That's true.

Regards,
Martin

From martin at v.loewis.de  Tue Jun 15 21:52:48 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Tue, 15 Jun 2010 21:52:48 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
Message-ID: <4C17DA10.8000508@v.loewis.de>

> As Martin von Löwis said, this already exists.
> "a.mirrors.pypi.python.org <http://a.mirrors.pypi.python.org> and
> b.mirrors.pypi.python.org <http://b.mirrors.pypi.python.org> are already
> there and could be used by clients". Maybe Martin can you explain us
> (apologies if this is already done somewhere) how things are working
> from now ? Is this possible to rely on the existing work rather than
> using a cloud system ? What's the in place infrastructure ?

Primarily, client support is missing: i.e. distutils won't fall back 
from one mirror to the next.
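
A rough sketch of the missing client piece, under stated assumptions: the
host names follow the a.mirrors.pypi.python.org pattern quoted above, the
client somehow knows the letter of the last registered mirror, and
mirror_hosts and fetch_with_fallback are hypothetical names, not existing
distutils functions.

```python
import string


def mirror_hosts(last_letter, suffix="mirrors.pypi.python.org"):
    """List single-letter mirror names from 'a' up to last_letter."""
    letters = string.ascii_lowercase
    end = letters.index(last_letter) + 1
    return ["%s.%s" % (letter, suffix) for letter in letters[:end]]


def fetch_with_fallback(path, hosts, fetch):
    """Call fetch(url) on each host in order; return the first successful result."""
    failures = []
    for host in hosts:
        url = "http://%s%s" % (host, path)
        try:
            return fetch(url)
        except OSError as exc:  # urllib's URLError is an OSError subclass
            failures.append((url, exc))
    raise OSError("all mirrors failed: %r" % (failures,))
```

The fetch argument could be something like
lambda url: urllib.request.urlopen(url, timeout=10).read(); it is passed
in so the fallback order can be tested without a network.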

As a minor issue, the download stats collection is also not implemented yet.

As for timeliness: it would be reasonable to set up the mirrors so that 
they won't be behind more than one minute (by polling for changes every 
minute). On the one hand, some people claim that this would be much too 
frequent, and that 10 minutes or more would be frequent enough. Others 
claim that changes should be propagated instantaneously. This would also 
be possible (given that the master knows the list of all mirrors),
but would need to be implemented as well.
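
Whatever interval is chosen, a client could apply it as a simple staleness
threshold. The sketch below assumes the client has already obtained each
mirror's last synchronisation time as a datetime (how it gets that value
is left open here), and the ten-minute default is just one of the figures
mentioned above, not an agreed policy.

```python
from datetime import datetime, timedelta


def is_fresh(last_sync, now, max_age=timedelta(minutes=10)):
    """Return True if the mirror synchronised within max_age before now.

    A last_sync in the future (clock skew) is treated as not fresh.
    """
    return timedelta(0) <= now - last_sync <= max_age


def usable_mirrors(sync_times, now, max_age=timedelta(minutes=10)):
    """Filter a {host: last_sync} mapping down to the mirrors that are fresh enough."""
    return sorted(host for host, last_sync in sync_times.items()
                  if is_fresh(last_sync, now, max_age))
```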

Regards,
Martin

From martin at v.loewis.de  Tue Jun 15 22:02:45 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 15 Jun 2010 22:02:45 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17BC38.6090208@egenix.com>
References: <4C1768AF.9040606@egenix.com> <4C17B6CE.20209@jcea.es>
	<4C17BC38.6090208@egenix.com>
Message-ID: <4C17DC65.7010707@v.loewis.de>

> Note that with community servers that only mirror once a day,
> you'd have to wait up to a whole day for your package updates
> to become visible worldwide.

However, the community mirrors would mirror every ten minutes, or more 
often. Implementing a push model would be fairly simple.

Regards,
Martin

From martin at v.loewis.de  Tue Jun 15 22:04:55 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 15 Jun 2010 22:04:55 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17C4B5.3000801@jcea.es>
References: <4C1768AF.9040606@egenix.com>
	<4C17B6CE.20209@jcea.es>	<4C17BC38.6090208@egenix.com>
	<4C17C4B5.3000801@jcea.es>
Message-ID: <4C17DCE7.6090802@v.loewis.de>

> I read pep 381 long time ago and I don't remember how/when a mirror
> would update, but I do remember it doesn't mandate digital signatures
> (signed by pypi central node, verified by setuptools&friends). That is a
> big gap, in my opinion.

The PEP doesn't explain the digital signing that is going on in 
mirroring. See

http://mail.python.org/pipermail/catalog-sig/2009-March/002018.html

This is fully implemented (except that clients would need to verify the 
signatures, and except that key rollover hasn't happened yet).

Regards,
Martin

From mal at egenix.com  Tue Jun 15 22:14:02 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Jun 2010 22:14:02 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTik_MDSpgfgxFaEzomiDTWhcb2aBu3eUahRVf8V7@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>	<4C17A419.4060602@egenix.com>	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>	<4C17B9B2.10006@egenix.com>
	<AANLkTik_MDSpgfgxFaEzomiDTWhcb2aBu3eUahRVf8V7@mail.gmail.com>
Message-ID: <4C17DF0A.3090008@egenix.com>

Tarek Ziadé wrote:
> On Tue, Jun 15, 2010 at 7:34 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> [..]
>>> So I think it would be better to focus on PEP 381, and make those
>>> existing mirrors comply with it. And maybe work on the legal issues
>>> you've mentioned
>>
>> That can all happen in parallel.
> 
> I really doubt it.
> 
> You have come with a cloud proposal and want it to be funded by the PSF.
> 
> Your proposal is basically a proprietary mirroring system, and it competes
> with the mirroring protocol we wanted to build, based on the existing
> mirrors the community has.

I'm not trying to compete with your mirror PEP, just trying
to solve a problem.

> So far I don't see any advantage in a cloud-based mirror managed by the PSF,
> compared to a round of community mirrors.

We can have it up and running in a few days and it doesn't
require any changes to existing client tools, that's the main
argument.

The proposal solves a problem we have now and doesn't get in the
way of PEP 381. Instead it buys it more time to get finalized,
implemented and deployed on the client side.

If you need funding for PEP 381, please write a proposal.
This would then also need to address the problem of added administration
overhead (screening mirror server providers, getting them registered or
removed, monitored and verified for correct operation, etc.).

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 15 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                33 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From mal at egenix.com  Tue Jun 15 22:33:15 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Jun 2010 22:33:15 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17DCE7.6090802@v.loewis.de>
References: <4C1768AF.9040606@egenix.com>	<4C17B6CE.20209@jcea.es>	<4C17BC38.6090208@egenix.com>	<4C17C4B5.3000801@jcea.es>
	<4C17DCE7.6090802@v.loewis.de>
Message-ID: <4C17E38B.7050103@egenix.com>

"Martin v. Löwis" wrote:
>> I read pep 381 long time ago and I don't remember how/when a mirror
>> would update, but I do remember it doesn't mandate digital signatures
>> (signed by pypi central node, verified by setuptools&friends). That is a
>> big gap, in my opinion.
> 
> The PEP doesn't explain the digital signing that is going on in
> mirroring. See
> 
> http://mail.python.org/pipermail/catalog-sig/2009-March/002018.html
> 
> This is fully implemented (except that clients would need to verify the
> signatures, and except that key rollover hasn't happened yet).

That's good to know, but I think some parts of this will have to be
discussed some more:

"""
/serverkey   Public DSA key of the server, in the PEM format
              as generated by "openssl dsa -pubout" (i.e. RFC 3280
              SubjectPublicKeyInfo, with the algorithm 1.3.14.3.2.12).
              This URL must *not* be mirrored, and clients must fetch
              the official serverkey from PyPI directly. The serverkey
"""

* How will clients be sure that they are getting the correct key ?

* What would a client do if the PyPI server is down ?

* How would clients protect their local cached copy of the
  server key against manipulation ?

* Without access to OpenSSL and M2Crypto, how would clients
  apply the check ?

Also, please consider that access to crypto code is restricted
in some parts of the world. Users in those countries would have
to be able to turn off verification.

-- 
Marc-Andre Lemburg
eGenix.com

From ziade.tarek at gmail.com  Tue Jun 15 22:46:46 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 15 Jun 2010 22:46:46 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C17DF0A.3090008@egenix.com>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
	<4C17A419.4060602@egenix.com>
	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>
	<4C17B9B2.10006@egenix.com>
	<AANLkTik_MDSpgfgxFaEzomiDTWhcb2aBu3eUahRVf8V7@mail.gmail.com>
	<4C17DF0A.3090008@egenix.com>
Message-ID: <AANLkTimaEw7oK4JmLIK3EcrQyy5CNveDT13lO-Fxdwu4@mail.gmail.com>

On Tue, Jun 15, 2010 at 10:14 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> Tarek Ziadé wrote:
>> On Tue, Jun 15, 2010 at 7:34 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>> [..]
>>>> So I think it would be better to focus on PEP 381, and make those
>>>> existing mirrors comply with it. And maybe work on the legal issues
>>>> you've mentioned
>>>
>>> That can all happen in parallel.
>>
>> I really doubt it.
>>
>> You have come with a cloud proposal and want it to be funded by the PSF.
>>
>> Your proposal is basically a proprietary mirroring system, and it competes
>> with the mirroring protocol we wanted to build, based on the existing
>> mirrors the community has.
>
> I'm not trying to compete with your mirror PEP, just trying
> to solve a problem.

We are trying to solve the same problem, aren't we ?

That is : avoiding any downtime when PyPI is used by setuptools and
derived tools.

So if you solve this problem by implementing a cloud system backed by
PSF funding and managed by the PSF, and if you claim that there will be
no more downtime, then PEP 381 will be useless.

I am just arguing that I don't think it's the best solution, compared to what
was started, i.e. a community network of mirrors.

>
>> So far I don't see any advantage in a cloud-based mirror managed by the PSF,
>> compared to a round of community mirrors.
>
> We can have it up and running in a few days and it doesn't
> require any changes to existing client tools, that's the main
> argument.

The global uptime of PyPI in this last year was probably around 99.9%,
so I don't think we are in such a rush to set up something in any case.

The problem occurred in the past, and was fixed in a matter of hours.
every. time.

It's just that every time it happens it makes us all want to improve the system.

So why don't we implement the best solution ? Maybe we could use a wiki page
and work on a synthetic overview of the pros and cons.

>
> The proposal solves a problem we have now and doesn't get in the
> way of PEP 381. Instead it buys it more time to get finalized,
> implemented and deployed on the client side.
>
> If you need funding for PEP 381, please write a proposal.

I won't.

I think we should decide here, all together, what is the best technical solution
to set up mirrors (e.g. cloud vs community)

Then, ask for its funding from the PSF.


> This would then also need to address the problem of added administration
> overhead (screening mirror server providers, getting them registered or
> removed, monitored and verified for correct operation, etc.).

This overhead is minimal compared to in-house administration of a
full mirroring system based on a cloud, imho.

Regards
Tarek

-- 
Tarek Ziadé | http://ziade.org

From martin at v.loewis.de  Tue Jun 15 22:48:00 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 15 Jun 2010 22:48:00 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17E38B.7050103@egenix.com>
References: <4C1768AF.9040606@egenix.com>	<4C17B6CE.20209@jcea.es>	<4C17BC38.6090208@egenix.com>	<4C17C4B5.3000801@jcea.es>
	<4C17DCE7.6090802@v.loewis.de> <4C17E38B.7050103@egenix.com>
Message-ID: <4C17E700.1090107@v.loewis.de>

> * How will clients be sure that they are getting the correct key ?

They should initially download it from the master server (when that is 
online) and cache it.

> * What would a client do if the PyPI server is down ?

Isn't that straight-forward?

> * How would clients protect their local cached copy of the
>    server key against manipulation ?

Using standard operating system access control.

> * Without access to OpenSSL and M2Crypto, how would clients
>    apply the check ?

distribute could include a pure-python checking function. The API
was specifically designed to make this possible.
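
For illustration, the arithmetic of a DSA check fits in a few lines of
plain Python. This is only a sketch of textbook DSA verification, not the
function distribute would ship: it takes the key parameters (p, q, g, y)
and the signature (r, s) as integers, so parsing the PEM-encoded
/serverkey and the signature encoding is left out, and SHA-1 is assumed
as the hash. dsa_sign_toy exists solely to produce test signatures with
tiny parameters and is not secure.

```python
import hashlib


def _digest_as_int(message, q):
    """Hash with SHA-1 and keep the leftmost q.bit_length() bits as an integer."""
    digest = hashlib.sha1(message).digest()
    value = int.from_bytes(digest, "big")
    excess = len(digest) * 8 - q.bit_length()
    return value >> excess if excess > 0 else value


def dsa_verify(message, signature, p, q, g, y):
    """Return True if (r, s) is a valid DSA signature of message under public key y."""
    r, s = signature
    if not (0 < r < q and 0 < s < q):
        return False
    w = pow(s, -1, q)  # modular inverse; needs Python 3.8+
    z = _digest_as_int(message, q)
    u1 = (z * w) % q
    u2 = (r * w) % q
    v = (pow(g, u1, p) * pow(y, u2, p)) % p % q
    return v == r


def dsa_sign_toy(message, p, q, g, x):
    """Deterministic toy signer used only to exercise dsa_verify; NOT secure."""
    z = _digest_as_int(message, q)
    for k in range(1, q):  # a real signer picks k at random and keeps it secret
        r = pow(g, k, p) % q
        s = (pow(k, -1, q) * (z + x * r)) % q
        if r and s:
            return r, s
    raise ValueError("no usable nonce for these parameters")
```

With real key sizes the same verify function applies unchanged; only the
code that extracts the integers from the key and signature files is
missing.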

> Also, please consider that access to crypto code is restricted
> in some parts of the world. Users in those countries would have
> to be able to turn off verification.

Most certainly. The simplest approach would be to turn off mirror usage 
in the first place. If you do use mirrors, it is then a matter of your
own risk evaluation whether you want the mirror result verified.

Notice that none of this protects against the master server being 
tampered with; the only way to protect against that is to use the PGP signing 
feature in PyPI (which, of course, package authors must use).

Regards,
Martin


From mal at egenix.com  Tue Jun 15 23:03:29 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Jun 2010 23:03:29 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17D87E.2050609@v.loewis.de>
References: <4C1768AF.9040606@egenix.com> <4C17D87E.2050609@v.loewis.de>
Message-ID: <4C17EAA1.5090609@egenix.com>

"Martin v. Löwis" wrote:
>> PyPI itself has in recent months been mostly maintained by one
>> developer: Martin von Loewis.  Projects are underway to enhance PyPI
>> in various ways, including a proposal to add external mirroring (PEP
>> 381), but these are all far from being finalized or implemented.
> 
> That's not at all accurate: PEP 381 is almost completely implemented
> in the mirroring tools. 

Which parts of PEP 381 are implemented ?

> Client-side support is missing, but isn't
> strictly necessary as users could manually point their setuptools
> installation to a mirror.

That's not a good argument. Users like setuptools because
they can run: "easy_install stuff" and let it do whatever it
needs to do.

It's important not to require changes on the client side.

>> While the /simple package listing is currently dynamically created
>> from the database in real-time, this is not really needed for normal
>> operation. A static copy created every 10-20 minutes would provide the
>> same level of service in much the same way.
> 
> For normal operation (i.e. on the master copy), this would be really
> insufficient. Users expect, in automated build processes, that the
> packages they upload are available for *immediate* download.

Power users and developers will probably want that, but those
can hook up to the PyPI server directly if they have such a
need.

For the majority, waiting 10-20 minutes should be fine.

Note that the push idea is part of the plan, but won't happen
in the initial rollout.

>> Under the proposal the static information stored in PyPI
>> (meta-information as well as package download files and documentation)
>> is moved to a content delivery network (CDN).
> 
> There is a good chance that, before that proposal is implemented,
> the PEP 381 implementation is completed.

Including getting all client side package tools updated and
deployed to the existing users ?

>> At the same intervals, another script will scan the package and
>> documentation files under /packages for updates and upload any changes
>> to the CDN for neartime availability.
> 
> Not sure why you wouldn't push every change immediately to the CDN, though.

The proposal wants to do without changing PyPI code where
possible. This is planned for a later release. If this can
be had without any major changes, we can also add it to phase
one.

>> Cloudfront itself has been around since Nov 2008.
> 
> Please add that Amazon considers Cloudfront as a beta service.

I don't think that makes a difference. The "beta" term is
a web 2.0 marketing term, nothing more. But I'll add it anyway.

>> The pypi.python.org domain would then have to be setup to map to
>> multiple IP addresses via DNS round-robin, one entry for each
>> redirection server, e.g.
>>
>>   pypi.python.org. IN A 123.123.123.1
>>   pypi.python.org. IN A 123.123.123.2
>>   pypi.python.org. IN A 123.123.123.3
>>   pypi.python.org. IN A 123.123.123.4
> 
> I don't think this works if one of the servers fails (or, worse,
> produces a hanging connection). What piece of software would implement
> the fallback to the next machine?

AFAIK, the package tools don't currently implement any kind of
fail-over.

While this would be good to have and provide a better
user experience, it's not required. The user would just need
to restart the command and then get a new server IP address
to try - just like you do in a web browser if a page doesn't
load. That's still a lot better than not being able to download
anything at all.
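
For illustration only, this is roughly what fail-over across the
round-robin A records could look like if a client tool did implement
it. It is a sketch under assumptions: the function names are made up,
and the `connect` parameter exists only so the logic can be exercised
without a network.

```python
import socket


def resolve_all(host, port=80):
    """Return every distinct IP address that DNS offers for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for family, socktype, proto, canonname, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


def connect_with_failover(addresses, port=80, timeout=5.0,
                          connect=socket.create_connection):
    """Try each address in turn; return (address, connection) for the first one that answers."""
    errors = []
    for address in addresses:
        try:
            return address, connect((address, port), timeout)
        except OSError as exc:  # refused connections and timeouts
            errors.append((address, exc))
    raise ConnectionError("all servers failed: %r" % errors)
```

A hanging server costs at most `timeout` seconds before the next
address is tried.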

The alternative would be a proxy setup, which then again introduces
a single point of failure (unless you set up an HA cluster).

The mirror PEP shares this problem with the cloud proposal.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 15 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                33 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From ziade.tarek at gmail.com  Tue Jun 15 23:09:22 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 15 Jun 2010 23:09:22 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C17CE55.5000601@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C15F5F3.40501@egenix.com>
	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>
	<4C176BD4.3080909@egenix.com>
	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>
	<4C17CE55.5000601@v.loewis.de>
Message-ID: <AANLkTilaoMa0OEkwcRzlL2ild5C4fGVf1hg-C5lsQmBu@mail.gmail.com>

On Tue, Jun 15, 2010 at 9:02 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
[..]
> Alternatively, you could start submitting patches.

Some work Mathieu did is already integrated via the branch I worked
on for PEP 345.
And we were considering using the same workflow since I can commit.
Of course, after a while, I wanted to propose Mathieu as a PyPI committer.

>> Maybe it would be easier to switch to the official mercurial repository
>> (hg.python.org <http://hg.python.org>), it would allow a better
>> collaboration between everybody who would like to contribute.
>
> I'm not quite sure why that would be. You still couldn't write to the
> repository, could you? So what would be the difference?

Not answering instead of Mathieu, but with a DVCS he will be able to write
to the repository, and have the same privileges any other committers have,
as long a

offer are great.

As a maintainer of the PyPI project, it makes your workflow simpler,

- contributors can clone the repo, change the code and ask you for a pull
- you can pull changes by direct hg commands, and merge them


>
> Regards,
> Martin



-- 
Tarek Ziadé | http://ziade.org

From ziade.tarek at gmail.com  Tue Jun 15 23:13:52 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 15 Jun 2010 23:13:52 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTilaoMa0OEkwcRzlL2ild5C4fGVf1hg-C5lsQmBu@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C15F5F3.40501@egenix.com>
	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>
	<4C176BD4.3080909@egenix.com>
	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>
	<4C17CE55.5000601@v.loewis.de>
	<AANLkTilaoMa0OEkwcRzlL2ild5C4fGVf1hg-C5lsQmBu@mail.gmail.com>
Message-ID: <AANLkTimws9VwrdoksMxNmW0SM29-L0FC67v0VNvURz5n@mail.gmail.com>

2010/6/15 Tarek Ziadé <ziade.tarek at gmail.com>:
> On Tue, Jun 15, 2010 at 9:02 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> [..]
>> Alternatively, you could start submitting patches.
>
> Some work Mathieu did is already integrated via the branch I worked
> on for PEP 345.
> And we were considering using the same workflow since I can commit.
> Of course, after a while, I wanted to propose Mathieu as a PyPI committer.
>
>>> Maybe it would be easier to switch to the official mercurial repository
>>> (hg.python.org <http://hg.python.org>), it would allow a better
>>> collaboration between everybody who would like to contribute.
>>
>> I'm not quite sure why that would be. You still couldn't write to the
>> repository, could you? So what would be the difference?


Oops, sent it too early. Please scratch my previous answer, here's the
finished one ;)

Not answering instead of Mathieu, but with a DVCS he will be able to write
to the repository, and have the same privileges and changeset
granularity any other committer has, as long as you or any direct
committer pulls his changes (after a review you can do on your side
with simple hg commands).

So the difference is the same, I guess, as the difference for Python
itself, which is switching to Mercurial.


Regards
Tarek

From martin at v.loewis.de  Tue Jun 15 23:24:23 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 15 Jun 2010 23:24:23 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17EAA1.5090609@egenix.com>
References: <4C1768AF.9040606@egenix.com> <4C17D87E.2050609@v.loewis.de>
	<4C17EAA1.5090609@egenix.com>
Message-ID: <4C17EF87.7090302@v.loewis.de>

>> That's not at all accurate: PEP 381 is almost completely implemented
>> in the mirroring tools.
>
> Which parts of PEP 381 are implemented ?

For the mirrors themselves: everything except for the propagation of
download counters.

> It's important not to require changes on the client side.

I disagree. It's the only way to provide reliable protection against
server failures. The client code must initiate the fallback, e.g. after
a timeout.
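
As a rough sketch of such a client-initiated fallback (not the code of
any existing tool): the list of server base URLs, the function name and
the `opener` parameter are assumptions made for the example.

```python
import urllib.request


def fetch_with_fallback(path, servers, timeout=10.0,
                        opener=urllib.request.urlopen):
    """Request path from each server in order; return (server, data) from the first success."""
    last_error = None
    for base in servers:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            with opener(url, timeout=timeout) as response:
                return base, response.read()
        except OSError as exc:  # URLError and socket timeouts derive from OSError
            last_error = exc
    raise RuntimeError("no server could deliver %s: %s" % (path, last_error))
```

The client would put the master first in `servers` and the mirrors it
is willing to use after it.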

>> For normal operation (i.e. on the master copy), this would be really
>> insufficient. Users expect, in automated build processes, that the
>> packages they upload are available for *immediate* download.
>
> Power users and developers will probably want that, but those
> can hook up to the PyPI server directly if they have such a
> need.

Under your proposal, how precisely would they do that?

>> There is a good chance that, before that proposal is implemented,
>> the PEP 381 implementation is completed.
>
> Including getting all client side package tools updated and
> deployed to the existing users ?

That depends on how long the proposal requires to get implemented.

However, I don't think it is necessary to have the tools updated
and deployed to all existing users. Instead, it is sufficient that
people who worry about server outages get the tools deployed; for
this, the answer is "yes".

>> Not sure why you wouldn't push every change immediately to the CDN, though.
>
> The proposal wants to do without changing PyPI code where
> possible.

-1000. What's the rationale for not modifying PyPI code?

Are you, by any chance, proposing that this CDN propagation tool does a 
full PyPI traversal every 20 minutes???

> While this would be good to have and provide a better
> user experience, it's not required. The user would just need
> to restart the command and then get a new server IP address
> to try - just like you do in a web browser if a page doesn't
> load. That's still a lot better than not being able to download
> anything at all.

I think this depends a lot on the client setup. For example, on
my machine, I don't get a different IP address for www.google.com
each time, using the DNS server in my Fritzbox router.

> The mirror PEP shares this problem with the cloud proposal.

Except that it gives the client the explicit choice which copy to get 
the data from.

Regards,
Martin


From mal at egenix.com  Tue Jun 15 23:24:51 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Jun 2010 23:24:51 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTimaEw7oK4JmLIK3EcrQyy5CNveDT13lO-Fxdwu4@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>	<4C17A419.4060602@egenix.com>	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>	<4C17B9B2.10006@egenix.com>	<AANLkTik_MDSpgfgxFaEzomiDTWhcb2aBu3eUahRVf8V7@mail.gmail.com>	<4C17DF0A.3090008@egenix.com>
	<AANLkTimaEw7oK4JmLIK3EcrQyy5CNveDT13lO-Fxdwu4@mail.gmail.com>
Message-ID: <4C17EFA3.6050204@egenix.com>

Tarek Ziadé wrote:
> On Tue, Jun 15, 2010 at 10:14 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>>
>> I'm not trying to compete with your mirror PEP, just trying
>> to solve a problem.
> 
> We are trying to solve the same problem, aren't we ?

Sure, but the intent is not to compete with the PEP. Even with
the cloud proposal implemented, we can still have a mirror
setup like the one you propose.

> That is : avoiding any downtime when PyPI is used by setuptools and
> derived tools.
>
> So if you solve this problem by implementing a cloud system backed by
> a PSF funding,
> and managed by the PSF, and if you claim  that there will be no more
> downtime, then PEP 381
> will be useless.

No, not at all. The PSF would not be the only user of the PEP
and the client tools.

If all client tools implement the things you suggested in the PEP,
we'd have a lot more possibilities.

> I am just arguing that I don't think it's the best solution, compared to what
> was started e.g. a community network of mirrors.

I've heard you, but still disagree. I think we'll just have to
leave it at that.

>>> So far I don't see any advantage in a cloud-based mirror managed by the PSF,
>>> compared to a round of community mirrors.
>>
>> We can have it up and running in a few days and it doesn't
>> require any changes to existing client tools, that's the main
>> argument.
> 
> The global uptime of PyPI in this last year was probably around 99.9%,
> so I don't think we are in such a rush to set up something in any case.
>
> The problem occurred in the past, and was fixed in a matter of hours.
> every. time.
> 
> It's just that every time it happens it makes us all want to improve the system.
> 
> So why don't we implement the best solution ? Maybe we could use a wiki page
> and work on a synthetic overview of the pros and cons.

Again: I don't want to compete against the PEP. I'm looking
for a solution that's easy to implement and doesn't get in the
way. That's all. Nothing more.

If you can come up with a solution that's ready in a month or two,
I'll happily wait.

>> The proposal solves a problem we have now and doesn't get in the
>> way of PEP 381. Instead it buys it more time to get finalized,
>> implemented and deployed on the client side.
>>
>> If you need funding for PEP 381, please write a proposal.
> 
> I won't.
> 
> I think we should decide here, all together, what is the best technical solution
> to set up mirrors (e.g. cloud vs community)
> 
> Then, ask for its funding from the PSF.
> 
> 
>> This would then also need to address the problem of added administration
>> overhead (screening mirror server providers, getting them registered or
>> removed, monitored and verified for correct operation, etc.).
> 
> This overhead is minimum compared to an in-house administration of a
> full mirroring system based on a cloud imho.

YMMV, but my experience with these systems is that they cause a lot
less overhead than anything you administer yourself.

-- 
Marc-Andre Lemburg
eGenix.com


From mal at egenix.com  Tue Jun 15 23:26:58 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 15 Jun 2010 23:26:58 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17E700.1090107@v.loewis.de>
References: <4C1768AF.9040606@egenix.com>	<4C17B6CE.20209@jcea.es>	<4C17BC38.6090208@egenix.com>	<4C17C4B5.3000801@jcea.es>	<4C17DCE7.6090802@v.loewis.de>
	<4C17E38B.7050103@egenix.com> <4C17E700.1090107@v.loewis.de>
Message-ID: <4C17F022.7050707@egenix.com>

"Martin v. Löwis" wrote:
>> * How will clients be sure that they are getting the correct key ?
> 
> They should initially download it from the master server (when that is
> online) and cache it.

So they'll use HTTPS and check the server certificate
as well ?

>> * What would a client do if the PyPI server is down ?
> 
> Isn't that straight-forward?

If the local cache doesn't have the server key, the tools
would have to download it from somewhere and if the main server
is down, that's not possible, so you reintroduce a single
point of failure.

>> * How would clients protect their local cached copy of the
>>    server key against manipulation ?
> 
> Using standard operating system access control.

So clients will have to be careful to get this right.

>> * Without access to OpenSSL and M2Crypto, how would clients
>>    apply the check ?
> 
> distribute could include a pure-python checking function. The API
> was specifically designed to make this possible.

Do you have a pure-Python DSA and PEM/DER parsing function
available ? Wouldn't a set of hex dumps be easier to parse ?

>> Also, please consider that access to crypto code is restricted
>> in some parts of the world. Users in those countries would have
>> to be able to turn off verification.
> 
> Most certainly. The simplest approach would be to turn off mirror usage
> in the first place. If you do use mirrors, it is then a matter of your
> own risk evaluation whether you want the mirror result verified.
> 
> Notice that none of this protects against the master server being
> tampered with; the only way to protect against that is to use the PGP signing
> feature in PyPI (which, of course, package authors must use).

Right, it's just an end-to-end authentication.

-- 
Marc-Andre Lemburg
eGenix.com


From martin at v.loewis.de  Tue Jun 15 23:28:05 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 15 Jun 2010 23:28:05 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTilaoMa0OEkwcRzlL2ild5C4fGVf1hg-C5lsQmBu@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<4C12A54D.1070406@egenix.com>	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>	<4C14D8E8.4010903@egenix.com>	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>	<4C15F5F3.40501@egenix.com>	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>	<4C176BD4.3080909@egenix.com>	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>	<4C17CE55.5000601@v.loewis.de>
	<AANLkTilaoMa0OEkwcRzlL2ild5C4fGVf1hg-C5lsQmBu@mail.gmail.com>
Message-ID: <4C17F065.7070309@v.loewis.de>

> As a maintainer of the PyPI project, it makes your workflow simpler,
>
> - contributors can clone the repo, change the code and ask you for a pull
> - you can pull changes by direct hg commands, and merge them

After using Mercurial in one project, I'm skeptical that this really 
makes things simpler. I find it very hard to find out what changes a 
specific clone has that I still need to integrate. Also, when merging 
with conflicts, I find it very difficult to determine whether I merged 
all the conflicts correctly (since the diff will show all changes, not 
just the conflicts).

So I rather expect things to become more difficult when switching to hg.

Regards,
Martin

From martin at v.loewis.de  Tue Jun 15 23:39:05 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 15 Jun 2010 23:39:05 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17F022.7050707@egenix.com>
References: <4C1768AF.9040606@egenix.com>	<4C17B6CE.20209@jcea.es>	<4C17BC38.6090208@egenix.com>	<4C17C4B5.3000801@jcea.es>	<4C17DCE7.6090802@v.loewis.de>
	<4C17E38B.7050103@egenix.com> <4C17E700.1090107@v.loewis.de>
	<4C17F022.7050707@egenix.com>
Message-ID: <4C17F2F9.6020401@v.loewis.de>

>>> * How will clients be sure that they are getting the correct key ?
>>
>> They should initially download it from the master server (when that is
>> online) and cache it.
>
> So they'll use HTTPS and check the server certificate
> as well ?

No. But they trust that the package contents are untampered when they
download from the central copy, so they should also trust that the
server key is untampered.

If some attacker could arrange to modify the server key (either during
transmission, or afterwards), the same threat applies to the actual
packages. So this doesn't add any new risk.

>>> * What would a client do if the PyPI server is down ?
>>
>> Isn't that straight-forward?
>
> If the local cache doesn't have the server key, the tools
> would have to download it from somewhere and if the main server
> is down, that's not possible, so you reintroduce a single
> point of failure.

That wouldn't be a problem, since one copy of the server key could ship 
with setuptools/distribute itself. So people who have never used it 
before could still validate the mirrors.

>>> * How would clients protect their local cached copy of the
>>>     server key against manipulation ?
>>
>> Using standard operating system access control.
>
> So clients will have to be careful to get this right.

Not any more than they do for the actual package data.
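
To sketch the two points above (a key copy shipped with the tool as a
fallback, and a local cache protected by ordinary file permissions),
something along these lines would do. The function names, the cache
path and the `bundled_key` argument are hypothetical and not taken
from setuptools/distribute.

```python
import os


def load_server_key(cache_path, bundled_key):
    """Prefer the locally cached key; fall back to the copy shipped with the tool."""
    try:
        with open(cache_path, "rb") as handle:
            return handle.read()
    except FileNotFoundError:
        return bundled_key


def cache_server_key(cache_path, key_bytes):
    """Store the key in a file that only the owning user may read or write."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    fd = os.open(cache_path, flags, 0o600)  # mode applies when the file is created
    with os.fdopen(fd, "wb") as handle:
        handle.write(key_bytes)
```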

>>> * Without access to OpenSSL and M2Crypto, how would clients
>>>     apply the check ?
>>
>> distribute could include a pure-python checking function. The API
>> was specifically designed to make this possible.
>
> Do you have a pure-Python DSA and PEM/DER parsing function
> available ? Wouldn't a set of hex dumps be easier to parse ?

See tools/verify.py.

Regards,
Martin

From ziade.tarek at gmail.com  Tue Jun 15 23:39:35 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 15 Jun 2010 23:39:35 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C17F065.7070309@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C15F5F3.40501@egenix.com>
	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>
	<4C176BD4.3080909@egenix.com>
	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>
	<4C17CE55.5000601@v.loewis.de>
	<AANLkTilaoMa0OEkwcRzlL2ild5C4fGVf1hg-C5lsQmBu@mail.gmail.com>
	<4C17F065.7070309@v.loewis.de>
Message-ID: <AANLkTikqUwIW81qwvGLZ-3laHREi21hv3SqPqFnSpLSz@mail.gmail.com>

2010/6/15 "Martin v. Löwis" <martin at v.loewis.de>:
>> As a maintainer of the PyPI project, it makes your workflow simpler,
>>
>> - contributors can clone the repo, change the code and ask you for a pull
>> - you can pull changes by direct hg commands, and merge them
>
> After using Mercurial in one project, I'm skeptical that this really makes
> things simpler. I find it very hard to find out what changes a specific
> clone has that I still need to integrate.

If the clone is used as an unsynced copy of the repository for various
work, you are right, it can become a nightmare!

I think the best practice is to make sure the clone is a freshly synced
one, containing only the commits you want to push to the "main" repo,
so the reviewer has a clean understanding when you ask for the pull.

An alternative approach is to use the queue system Mercurial has,
which provides commands that create patches you can then send to the
reviewers. You can even use a tool like CodeReview in that case. Then
I guess it doesn't really matter if the main repo is svn or hg.


> Also, when merging with conflicts,
> I find it very difficult to determine whether I merged all the conflicts
> correctly (since the diff will show all changes, not just the conflicts).
>
> So I rather expect things to become more difficult when switching to hg.

Well I guess it's up to you anyway :)

> Regards,
> Martin
>



-- 
Tarek Ziadé | http://ziade.org

From simon at ikanobori.jp  Tue Jun 15 23:50:43 2010
From: simon at ikanobori.jp (Simon de Vlieger)
Date: Tue, 15 Jun 2010 23:50:43 +0200
Subject: [Catalog-sig] Fwd: [Distutils] Proposal: Move PyPI static data to
	the cloud for better availability
References: <4C17DBD2.20509@v.loewis.de>
Message-ID: <6E682DB9-5750-4D42-813F-4872FF42565D@ikanobori.jp>

I am forwarding this message as I initially posted my message to the  
wrong mailinglist (distutils) and Martin has responded on that list.

Begin forwarded message:

> From: "Martin v. Löwis" <martin at v.loewis.de>
> Date: 15 June 2010 22:00:18 GMT+02:00
> To: Simon de Vlieger <simon at ikanobori.jp>
> Cc: Mathieu Leduc-Hamel <marrakis at gmail.com>, distutils-sig at python.org
> Subject: Re: [Distutils] [Catalog-sig] Proposal: Move PyPI static  
> data to the cloud for better availability
>
>> Is there any Nagios monitoring in place or is there the need to have
>> some external reliability monitoring in place?
>
> There is no external monitoring in place that I know of. I know ZC  
> had some monitoring that was supposed to send me an email, but that  
> was setup a few years ago, and recently didn't report the downtime.
>
> My own mirroring reported the downtime (indirectly, by reporting  
> that it couldn't mirror anymore); this is how I noticed one of the  
> recent outages.
>
>> I can set up a Nagios machine to check the HTTP status of PyPi.
>
> If it's easy to setup: why not? What exactly would that check?
>
>>> As you said, we may have the same problem in the future on all
>>> mirroring nodes ...
>>
>> Yes, some more investigative work should be done on the reason
>> for the apparent unreliability.
>
> The pep381mirror software produces a set of static files on the  
> mirror, so you don't need to run PyPI itself. I merely use Apache to  
> serve the PyPI mirrors.
>
> Regards,
> Martin
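
On the question of what exactly such a check would do, here is a
minimal sketch of an HTTP check in the style of a Nagios plugin. The
exit codes 0, 1 and 2 are the usual Nagios OK/WARNING/CRITICAL values;
the URL to probe, the 5-second warning threshold and the function
names are assumptions made for the example.

```python
import time
import urllib.request

OK, WARNING, CRITICAL = 0, 1, 2  # conventional Nagios plugin exit codes


def classify(status, elapsed, warn_after=5.0):
    """Map an HTTP status and response time to (exit code, message)."""
    if status != 200:
        return CRITICAL, "CRITICAL - HTTP status %s" % status
    if elapsed > warn_after:
        return WARNING, "WARNING - answered in %.1fs" % elapsed
    return OK, "OK - answered in %.1fs" % elapsed


def check_http(url, timeout=10.0, opener=urllib.request.urlopen):
    """Fetch url once and report the result in Nagios terms."""
    started = time.time()
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except OSError as exc:  # connection errors, timeouts, HTTP error statuses
        return CRITICAL, "CRITICAL - %s" % exc
    return classify(status, time.time() - started)
```

A wrapper script would print the message and call `sys.exit()` with
the returned code.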


From marrakis at gmail.com  Tue Jun 15 23:55:07 2010
From: marrakis at gmail.com (Mathieu Leduc-Hamel)
Date: Tue, 15 Jun 2010 23:55:07 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C17CE55.5000601@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C15F5F3.40501@egenix.com>
	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>
	<4C176BD4.3080909@egenix.com>
	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>
	<4C17CE55.5000601@v.loewis.de>
Message-ID: <AANLkTikuYn0XoVHBCd4CKew5xvI8JJdXzKOEmjCXmn7W@mail.gmail.com>

>
> Just be prepared to provide the code as separately-reviewable chunks
>
> of modifications.
>
>
That's exactly the point. I may be wrong, but I and other people want to
contribute, and that's exactly what projects like Bitbucket and code review
tools allow.

I have worked with people with a very wide range of experience at our local
Python user group, and one common complaint is that it's always difficult to
contribute.

Using a DVCS is exactly one good way to deal with merges and code review.

I'm not asking for committer access right away. I just want to be able
to contribute, because I'm open to working on something that needs to be done.

>
> Alternatively, you could start submitting patches.
>
> I'm not quite sure why that would be. You still couldn't write to the
> repository, could you? So what would be the difference?
>

For sure. Right now I work on Tarek's repo, and he is responsible for merging
into the main svn repo and the production server of PyPI. Having a complete
Mercurial workflow would be easier...

From jcea at jcea.es  Tue Jun 15 23:55:32 2010
From: jcea at jcea.es (Jesus Cea)
Date: Tue, 15 Jun 2010 23:55:32 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTik-1Z6vecE5b83fKNA-qo687EiQUQDq5CIM5Oo5@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>
	<4C17B6CE.20209@jcea.es>	<4C17BC38.6090208@egenix.com>
	<4C17C4B5.3000801@jcea.es>
	<AANLkTik-1Z6vecE5b83fKNA-qo687EiQUQDq5CIM5Oo5@mail.gmail.com>
Message-ID: <4C17F6D4.2050504@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 15/06/10 20:52, Tarek Ziadé wrote:
> Do you trust the package you are installing more than an "official"
> mirror ? if so, why ?

If a package is signed by the author, I only need to "trust" the author.

If a package is not signed in PYPI, I must "trust" the author, PYPI
admins and pypi machines security.

If I download from a mirror, with no digital signature, I must trust the
author, PYPI admins, pypi machines security, mirror admins, mirror
machine security and mirror replication protocol. And all network
connections and harddisks in between.

Maybe it is just me, call me paranoid, but I pay close attention to where the
package being installed by "easy_install" is pulled from. I have
documented where each package used to live and I check carefully when I
see an unexpected URL. And I freak out when a package upgrade includes
new dependencies I haven't seen before.

> Anyone can upload a package at PyPI with
>
>   os.system('rm -rf /')
>
> in its setup.py...

True. And SCARY. Fortunately I only install packages I am interested
in, check signatures, etc. Of course, I can be hacked if the original
author put a trojan in the package, or he/she was hacked before. But my
exposure is smaller than if I must also trust every link in a LONG chain
of mirrors.

Just check this link, for a recent example:

<http://it.slashdot.org/firehose.pl?op=view&type=story&sid=10/06/13/0046256>

The trojan was not in the original sourcecode, but in an altered mirror
version.

Asking the PyPI central node to add signatures is a trivial way of
avoiding this issue. The question is not whether to trust mirrors,
but that we have the technology to be safe even if the mirrors are
not trusted. I don't NEED to trust you to be safe. I am happy!

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTBf21Jlgi5GaxT1NAQLPngP+NfLf7js3ni9FvoDjkrzOB0AmRIyfmDJm
tm0wNEVIlTY+d3st76Gd62ET+VxtgNHfWyNQ82Zp0iAISoWlpDyflJlZ1r5oVjAR
sWOSntdXXZAaaxOkumggi1cHKVCbWAe+62fGctTLWt4QtP4557yJDHZO1LKp1nWe
qtHX5LyUD5k=
=yGPk
-----END PGP SIGNATURE-----

From ziade.tarek at gmail.com  Wed Jun 16 00:01:58 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Wed, 16 Jun 2010 00:01:58 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C17EFA3.6050204@egenix.com>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
	<4C17A419.4060602@egenix.com>
	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>
	<4C17B9B2.10006@egenix.com>
	<AANLkTik_MDSpgfgxFaEzomiDTWhcb2aBu3eUahRVf8V7@mail.gmail.com>
	<4C17DF0A.3090008@egenix.com>
	<AANLkTimaEw7oK4JmLIK3EcrQyy5CNveDT13lO-Fxdwu4@mail.gmail.com>
	<4C17EFA3.6050204@egenix.com>
Message-ID: <AANLkTikVb7NPEmC8MYjvSBagjHeKVSoANmxZFpe76_7Y@mail.gmail.com>

On Tue, Jun 15, 2010 at 11:24 PM, M.-A. Lemburg <mal at egenix.com> wrote:
[..]
>> I am just arguing that I don't think it's the best solution, compared to what
>> was started e.g. a community network of mirrors.
>
> I've heard you, but still disagree. I think we'll just have to
> leave it at that.

Sure. Although, I am pretty sure we will come up with a consensus here
at some point :)

[..]
>> So why don't we implement the best solution ? Maybe we could use a wiki page
>> and work on a synthetic overview of the pros and cons.
>
> Again: I don't want to compete against the PEP. I'm looking
> for a solution that's easy to implement and doesn't get in the
> way. That's all. Nothing more.
>
> If you can come up with a solution that's ready in a month or two,
> I'll happily wait.

If I understood Martin correctly, he did some work and things are looking good;
so I'll let him answer.

Adding the failover part in distribute/pip shouldn't take too long,
though; falling back to a mirror is a small change.
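
To illustrate the kind of client-side fallback meant here, a minimal
sketch follows. It is not the actual distribute or pip code; the function
names are made up for the example, and the list of index URLs is assumed
to be supplied by the caller (main index first, mirrors after it).

```python
import urllib.request


def fetch_url(url, timeout=10):
    """Default fetcher: return the body of `url` as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_failover(path, index_urls, fetch=fetch_url):
    """Try each index in order; return (index_url, body) from the first that answers.

    `index_urls` lists the main index first, followed by mirrors.
    `fetch` is injectable so the logic can be exercised without a network.
    """
    errors = []
    for base in index_urls:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return base, fetch(url)
        except OSError as exc:  # URLError and socket timeouts are OSError subclasses
            errors.append((base, exc))
    raise OSError("all indexes failed: %r" % (errors,))
```

A caller would pass something like ["http://pypi.python.org/",
"http://mirror.example.invalid/"] (the second URL is a placeholder) and
use whichever index answered first.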

What's also important is to make sure z3c.pypimirror includes the
server-side work, so existing mirrors can be upgraded.


Regards
Tarek

-- 
Tarek Ziadé | http://ziade.org

From jcea at jcea.es  Wed Jun 16 00:07:16 2010
From: jcea at jcea.es (Jesus Cea)
Date: Wed, 16 Jun 2010 00:07:16 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17DA10.8000508@v.loewis.de>
References: <4C1768AF.9040606@egenix.com>	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
	<4C17DA10.8000508@v.loewis.de>
Message-ID: <4C17F994.2010000@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 15/06/10 21:52, "Martin v. Löwis" wrote:
> As for timeliness: it would be reasonable to setup the mirrors so that
> they won't be behind more than one minute (by polling for changes every
> minute). On the one hand, some people claim that this would be much too
> frequent, and that 10 minutes or more would be frequent enough. Others
> claim that changes should be propagated instantaneously. This would also
> be possible (given that the master knows the list of all mirrors),
> but would need to be implemented as well.

WebHooks: <http://webhooks.pbworks.com/>

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTBf5lJlgi5GaxT1NAQKY9wP/X96mYFA2BWpiQVbuQ+bKM9TWrIZzo+49
jCZFVN67LxecKbhvPvuO1XUCMpECiyl0ycowTUC00+Q+gJIm1TMzw5gJPdh2avy5
kZk31rEmWVIhWN+AclzSgK6CJxZ6Y9YnVsySs185YfM+BpVanjwBma73rU3Vrq0x
zLxjHGXDLJI=
=1Jcr
-----END PGP SIGNATURE-----

From jcea at jcea.es  Wed Jun 16 00:11:14 2010
From: jcea at jcea.es (Jesus Cea)
Date: Wed, 16 Jun 2010 00:11:14 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17DCE7.6090802@v.loewis.de>
References: <4C1768AF.9040606@egenix.com>	<4C17B6CE.20209@jcea.es>	<4C17BC38.6090208@egenix.com>	<4C17C4B5.3000801@jcea.es>
	<4C17DCE7.6090802@v.loewis.de>
Message-ID: <4C17FA82.1000104@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 15/06/10 22:04, "Martin v. Löwis" wrote:
>> I read pep 381 long time ago and I don't remember how/when a mirror
>> would update, but I do remember it doesn't mandate digital signatures
>> (signed by pypi central node, verified by setuptools&friends). That is a
>> big gap, in my opinion.
> 
> The PEP doesn't explain the digital signing that is going on in
> mirroring. See
> 
> http://mail.python.org/pipermail/catalog-sig/2009-March/002018.html
> 
> This is fully implemented (except that client would need to verify the
> signatures, and except key rollover hasn't happened yet).

Could I ask for PEP 381 to be updated?

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTBf6gplgi5GaxT1NAQJh6AP/T0pyein9GQ2ZmsL1JOxQOdGMhZfg7Jxu
go2WuHgrV2Jog7koQFDaX0y/gwTonW5w9AWRcsbQTbOL+ss9JUMgAvd2aSRhWMu2
SQrTsbimuJwHwPbVLRzV3HS6NsgzJgwIEexjmJ1a6kVKvbwOL3RsOqgMyK8/5ka2
V2cWn//0Jzc=
=Rplg
-----END PGP SIGNATURE-----

From martin at v.loewis.de  Wed Jun 16 00:17:13 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 16 Jun 2010 00:17:13 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTikVb7NPEmC8MYjvSBagjHeKVSoANmxZFpe76_7Y@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>	<4C17A419.4060602@egenix.com>	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>	<4C17B9B2.10006@egenix.com>	<AANLkTik_MDSpgfgxFaEzomiDTWhcb2aBu3eUahRVf8V7@mail.gmail.com>	<4C17DF0A.3090008@egenix.com>	<AANLkTimaEw7oK4JmLIK3EcrQyy5CNveDT13lO-Fxdwu4@mail.gmail.com>	<4C17EFA3.6050204@egenix.com>
	<AANLkTikVb7NPEmC8MYjvSBagjHeKVSoANmxZFpe76_7Y@mail.gmail.com>
Message-ID: <4C17FBE9.8040400@v.loewis.de>

> What's important also, is to make sure z3c.pypimirror includes the
> server-side work, so existing mirrors can be upgraded.

Not really. z3c.pypimirror has a completely different function. 
Operators providing one of the official PyPI mirrors should use 
pep381client instead.

Of course, if people absolutely want to, they could also put PEP 381 
support in z3c.pypimirror, but that may result in a significant rewrite.

Regards,
Martin

From martin at v.loewis.de  Wed Jun 16 00:19:36 2010
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 16 Jun 2010 00:19:36 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17FA82.1000104@jcea.es>
References: <4C1768AF.9040606@egenix.com>	<4C17B6CE.20209@jcea.es>	<4C17BC38.6090208@egenix.com>	<4C17C4B5.3000801@jcea.es>	<4C17DCE7.6090802@v.loewis.de>
	<4C17FA82.1000104@jcea.es>
Message-ID: <4C17FC78.2090700@v.loewis.de>

> Could I ask for PEP 381 to be updated?

Sure you can ask. So did I.

Regards,
Martin

From jcea at jcea.es  Wed Jun 16 00:20:15 2010
From: jcea at jcea.es (Jesus Cea)
Date: Wed, 16 Jun 2010 00:20:15 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17E38B.7050103@egenix.com>
References: <4C1768AF.9040606@egenix.com>	<4C17B6CE.20209@jcea.es>	<4C17BC38.6090208@egenix.com>	<4C17C4B5.3000801@jcea.es>
	<4C17DCE7.6090802@v.loewis.de> <4C17E38B.7050103@egenix.com>
Message-ID: <4C17FC9F.4070507@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 15/06/10 22:33, M.-A. Lemburg wrote:
> * How will clients be sure that they are getting the correct key ?

Err... Download from a HTTPS server, with certificate verification in
the client, would be nice :).

> * What would a client do if the PyPI server is down ?

I would keep using the old key if I can't refresh it. If the key is
changed once per year, that would be painless most of the time.

> * How would clients protect their local cached copy of the
>   server key against manipulation ?

Well, if you can alter the local cached key, you can also alter the
client code to skip the verification completely.

> * Without access to OpenSSL and M2Crypto, how would clients
>   apply the check ?

Some time ago I proposed using "ElGamal" signatures. The check can be done
in pure Python in maybe 5 lines of code. I use this in my own projects.
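
For reference, a textbook ElGamal check really is that short. The sketch
below is a generic illustration, not the code from any particular
project: the function names are made up, the digest h is assumed to be
an integer already derived from the file (for example from a hash), and
the parameters in the usage note are toy values far too small to be
secure. The signing helper is included only so the check can be tried
out; it needs Python 3.8+ for the modular inverse via pow().

```python
def elgamal_sign(p, g, x, h, k):
    """Sign integer digest h with private key x and per-signature secret k.

    k must be coprime to p - 1 and must never be reused.
    """
    r = pow(g, k, p)
    s = ((h - x * r) * pow(k, -1, p - 1)) % (p - 1)
    return r, s


def elgamal_verify(p, g, y, h, r, s):
    """Return True if (r, s) is a valid signature of digest h under public key y."""
    if not (0 < r < p and 0 < s < p - 1):
        return False  # out-of-range values are rejected outright
    # Valid signatures satisfy g^h == y^r * r^s (mod p).
    return pow(g, h, p) == (pow(y, r, p) * pow(r, s, p)) % p
```

With the toy values p = 467, g = 2, private key x = 127 and public key
y = pow(g, x, p), signing h = 100 with k = 213 yields a pair that
verifies, while the same pair checked against h = 101 is rejected.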

> Also, please consider that access to crypto code is restricted
> in some parts of the world. Users in those countries would have
> to be able to turn off verification.

Not for verification, I think. If the verification is 100% Python, with
no crypto library required, there is less legal risk.

Personally, I would ban mirrors deployed in no-crypto countries if I
cannot "certify" the files they are serving.

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTBf8n5lgi5GaxT1NAQJR6AP6A45T2KF7k6v60w8fa2oH5ZBK/7x3lOgI
RQT69ftWwZT+ifPnhJlOMAJ+Xq7F18PL3uOwgsj1Ce12KjimkHPnrOy09+/TblOL
Hy0hijddktcAdaaPwBOgE1sOL2ffPsXUk0afKJzPOzYIqFzdqzpb49DYH6vvwsuh
I4jJT12x3Ps=
=8SNq
-----END PGP SIGNATURE-----

From martin at v.loewis.de  Wed Jun 16 00:23:10 2010
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 16 Jun 2010 00:23:10 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C17F994.2010000@jcea.es>
References: <4C1768AF.9040606@egenix.com>	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>	<4C17DA10.8000508@v.loewis.de>
	<4C17F994.2010000@jcea.es>
Message-ID: <4C17FD4E.6030005@v.loewis.de>

> WebHooks:<http://webhooks.pbworks.com/>

Exactly so. Still, it requires a non-static web server.

Also, with a push model, it's more difficult for the client to determine 
whether the server is current. In a pull model, the client can look at 
the last synchronization timestamp, and determine whether that's good 
enough.

Of course, if you trust that the push actually works, you could fake the 
synchronization timestamp if no sync operation is going on.
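
The pull-model check described above can be sketched as follows. This is
only an illustration: the function name and the 10-minute default are
assumptions, and how the client obtains the mirror's last
synchronization time (and in which format) is left out.

```python
from datetime import datetime, timedelta, timezone


def mirror_is_fresh(last_sync, max_age=timedelta(minutes=10), now=None):
    """Return True if the mirror's last synchronization is recent enough.

    `last_sync` and `now` are timezone-aware datetimes; `now` defaults to
    the current UTC time and is a parameter only to make testing easy.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_sync <= max_age
```

A client that finds the mirror stale could then try the next mirror or
the master instead.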

Regards,
Martin

From ziade.tarek at gmail.com  Wed Jun 16 00:27:57 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Wed, 16 Jun 2010 00:27:57 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTikuYn0XoVHBCd4CKew5xvI8JJdXzKOEmjCXmn7W@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C15F5F3.40501@egenix.com>
	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>
	<4C176BD4.3080909@egenix.com>
	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>
	<4C17CE55.5000601@v.loewis.de>
	<AANLkTikuYn0XoVHBCd4CKew5xvI8JJdXzKOEmjCXmn7W@mail.gmail.com>
Message-ID: <AANLkTilqUWQE4dpq8JNduHjhAnct5sgKGU8X3UNPhn69@mail.gmail.com>

On Tue, Jun 15, 2010 at 11:55 PM, Mathieu Leduc-Hamel
<marrakis at gmail.com> wrote:
>>> Just be prepared to provide the code as separately-reviewable chunks
>>
>> of modifications.
>
> That's exactly the point. I may be wrong, but other people and I want to
> contribute, and that's exactly what projects like Bitbucket and code review
> tools allow.
> I have worked with people with a very wide range of experience at our local
> Python user group, and one common complaint is that it's always difficult to
> contribute.
> Using a DVCS is one good way to deal with merges and code review.
> I'm not asking for committer access right away. I just want to be able
> to contribute, because I'm open to working on something that needs to be done.
>>
>> Alternatively, you could start submitting patches.
>>
>> I'm not quite sure why that would be. You still couldn't write to the
>> repository, could you? So what would be the difference?
>
> For sure, right now I work in Tarek's repo, and he is responsible for merging
> into the main svn repo and the production server of PyPI. Having a complete
> Mercurial workflow would be easier...

Note that Martin is doing the final step (checking the changes before
they go into production and updating the production server).

From steve at pearwood.info  Wed Jun 16 00:24:23 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 16 Jun 2010 08:24:23 +1000
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C17BBE5.4010901@jcea.es>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTikp7gTnJAXxCrXbwxf0C4Lz2h4ka8Udlw0r4tgO@mail.gmail.com>
	<4C17BBE5.4010901@jcea.es>
Message-ID: <201006160824.23449.steve@pearwood.info>

On Wed, 16 Jun 2010 03:44:05 am Jesus Cea wrote:

> 2. Packages MUST be digitally signed. Ideally by the owner

-1 on requiring that by the package owner. While digitally signing 
packages is a good idea, the state of the art is not yet so simple that 
this will be anything but a barrier to entry to the average Python 
developer. Not to mention there are places in the world where effective 
encryption is illegal.


> but at least by PYPI central node (current pypi server). 

Martin has said this is already planned, and linked here:

http://mail.python.org/pipermail/catalog-sig/2009-March/002018.html

Has anyone considered whether there are any legal implications of this?

A digital signature is not an MD5 checksum, it may have actual legal 
meaning in many countries equivalent to a pen and paper signature. 
IANAL but I do not believe that it is a good idea to be signing 
arbitrary packages without knowing what they are (other than "a bunch 
of bytes uploaded from some arbitrary IP address") any more than I 
would put my physical signature on a parcel handed to me by some random 
person at the airport.

I would not be digitally signing anything I didn't create unless I had 
good legal advice that it was safe to do so.



-- 
Steven D'Aprano

From justinc at cs.washington.edu  Wed Jun 16 00:32:39 2010
From: justinc at cs.washington.edu (Justin Cappos)
Date: Tue, 15 Jun 2010 15:32:39 -0700
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C17F6D4.2050504@jcea.es>
References: <4C1768AF.9040606@egenix.com> <4C17B6CE.20209@jcea.es>
	<4C17BC38.6090208@egenix.com> <4C17C4B5.3000801@jcea.es>
	<AANLkTik-1Z6vecE5b83fKNA-qo687EiQUQDq5CIM5Oo5@mail.gmail.com>
	<4C17F6D4.2050504@jcea.es>
Message-ID: <AANLkTilH12k89h5SwCW8CKUue3JASTESmzzxB_l-IOAU@mail.gmail.com>

On Tue, Jun 15, 2010 at 2:55 PM, Jesus Cea <jcea at jcea.es> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 15/06/10 20:52, Tarek Ziadé wrote:
>> Do you trust the package you are installing more than an "official"
>> mirror ? if so, why ?
>
> If a package is signed by the author, I only need to "trust" the author.

I think it might not be this simple.   You're still trusting PYPI to
provide you with the latest version of a package.   Absent other
mechanisms, you don't have a way to tell if the file you're being
served is actually a version that is obsolete (possibly due to
security flaws).

Also, in practice many package managers perform dependency resolution
based upon metadata that isn't signed with the author's GPG key.
http://www.cs.arizona.edu/stork/packagemanagersecurity/otherattacks.html#extradep


Is the plan to use what is proposed in
http://mail.python.org/pipermail/catalog-sig/2009-March/002018.html in
practice?   Is more information available about this?   Does this
protect against man-in-the-middle attacks?


> If a package is not signed in PYPI, I must "trust" the author, PYPI
> admins and pypi machines security.
>
> If I download from a mirror, with no digital signature, I must trust the
> author, PYPI admins, pypi machines security, mirror admins, mirror
> machine security and mirror replication protocol. And all network
> connections and harddisks in between.
>
> Maybe it is just me, call me paranoid, but I pay close attention to where the
> package being installed by "easy_install" is pulled from. I have
> documented where each package used to live and I check carefully when I
> see an unexpected URL. And I freak out when a package upgrade includes
> new dependencies I haven't seen before.
>
>> Anyone can upload a package at PyPI with
>>
>>   os.system('rm -rf /')
>>
>> in its setup.py...
>
> True. And SCARY. Fortunately I only install packages I am interested
> in, check signatures, etc. Of course, I can be hacked if the original
> author put a trojan in the package, or he/she was hacked before. But my
> exposure is smaller than if I must also trust every link in a LONG chain
> of mirrors.
>
> Just check this link, for a recent example:
>
> <http://it.slashdot.org/firehose.pl?op=view&type=story&sid=10/06/13/0046256>
>
> The trojan was not in the original sourcecode, but in an altered mirror
> version.
>
> Asking for pypi central node to add signatures is a trivial way of
> avoiding this issue. The question is not to trust or not to trust
> mirrors, but that we have technology to be safe even if the mirrors are
> not trusted. I don't NEED to trust you to be safe. I am happy!.

I think there are other subtle issues here dealing with key
revocation, mismatching of package versions, etc.

A lot of these issues are pretty subtle and I'd be happy to talk in
more detail about how one might address them.   In fact, we have a
project that is trying to do so:
https://www.updateframework.com/


Geremy do you want to chime in?

Thanks,
Justin



From ziade.tarek at gmail.com  Wed Jun 16 00:34:22 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Wed, 16 Jun 2010 00:34:22 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C17F6D4.2050504@jcea.es>
References: <4C1768AF.9040606@egenix.com> <4C17B6CE.20209@jcea.es>
	<4C17BC38.6090208@egenix.com> <4C17C4B5.3000801@jcea.es>
	<AANLkTik-1Z6vecE5b83fKNA-qo687EiQUQDq5CIM5Oo5@mail.gmail.com>
	<4C17F6D4.2050504@jcea.es>
Message-ID: <AANLkTilRedZXp2bahx2DdSXtPva2e6-oejW-2AFtnPdj@mail.gmail.com>

On Tue, Jun 15, 2010 at 11:55 PM, Jesus Cea <jcea at jcea.es> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 15/06/10 20:52, Tarek Ziadé wrote:
>> Do you trust the package you are installing more than an "official"
>> mirror ? if so, why ?
>
> If a package is signed by the author, I only need to "trust" the author.
>
> If a package is not signed in PYPI, I must "trust" the author, PYPI
> admins and pypi machines security.
>
> If I download from a mirror, with no digital signature, I must trust the
> author, PYPI admins, pypi machines security, mirror admins, mirror
> machine security and mirror replication protocol. And all network
> connections and harddisks in between.
>
> Maybe it is just me, call me paranoid, but I pay close attention to where the
> package being installed by "easy_install" is pulled from. I have
> documented where each package used to live and I check carefully when I
> see an unexpected URL. And I freak out when a package upgrade includes
> new dependencies I haven't seen before.

Makes sense.

>
>> Anyone can upload a package at PyPI with
>>
>>   os.system('rm -rf /')
>>
>> in its setup.py...
>
> True. And SCARY. Fortunately I only install packages I am interested
> in, check signatures, etc. Of course, I can be hacked if the original
> author put a trojan in the package, or he/she was hacked before. But my
> exposure is smaller than if I must also trust every link in a LONG chain
> of mirrors.
>
> Just check this link, for a recent example:
>
> <http://it.slashdot.org/firehose.pl?op=view&type=story&sid=10/06/13/0046256>
>
> The trojan was not in the original sourcecode, but in an altered mirror
> version.
>
> Asking for pypi central node to add signatures is a trivial way of
> avoiding this issue. The question is not to trust or not to trust
> mirrors, but that we have technology to be safe even if the mirrors are
> not trusted. I don't NEED to trust you to be safe. I am happy!.

Sure, the ultimate solution is signatures, and I had forgotten that Martin
worked on this last year.

My opinion is just that until they are available and used, all PyPI
mirrors maintained by known members of the community pose only a limited risk.

From fdrake at acm.org  Wed Jun 16 00:37:11 2010
From: fdrake at acm.org (Fred Drake)
Date: Tue, 15 Jun 2010 18:37:11 -0400
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <201006160824.23449.steve@pearwood.info>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTikp7gTnJAXxCrXbwxf0C4Lz2h4ka8Udlw0r4tgO@mail.gmail.com> 
	<4C17BBE5.4010901@jcea.es> <201006160824.23449.steve@pearwood.info>
Message-ID: <AANLkTinE-JiyR93TT2mA1PV0LebxZ4QKi4AHzN2OvVOl@mail.gmail.com>

On Tue, Jun 15, 2010 at 6:24 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> A digital signature is not an MD5 checksum, it may have actual legal
> meaning in many countries equivalent to a pen and paper signature.

I would expect that verifying a package was signed by PyPI to mean no more than
that the bits match what's available from PyPI for the same name.  (Not sure if
that's what's in the PEP, but that's what I'd be looking for.)

We'd have to disclaim anything more than that.  But it would be useful to verify
that a package from a mirror was accurately mirrored.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"Chaos is the score upon which reality is written." --Henry Miller

From ziade.tarek at gmail.com  Wed Jun 16 00:38:59 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Wed, 16 Jun 2010 00:38:59 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C17FBE9.8040400@v.loewis.de>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
	<4C17A419.4060602@egenix.com>
	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>
	<4C17B9B2.10006@egenix.com>
	<AANLkTik_MDSpgfgxFaEzomiDTWhcb2aBu3eUahRVf8V7@mail.gmail.com>
	<4C17DF0A.3090008@egenix.com>
	<AANLkTimaEw7oK4JmLIK3EcrQyy5CNveDT13lO-Fxdwu4@mail.gmail.com>
	<4C17EFA3.6050204@egenix.com>
	<AANLkTikVb7NPEmC8MYjvSBagjHeKVSoANmxZFpe76_7Y@mail.gmail.com>
	<4C17FBE9.8040400@v.loewis.de>
Message-ID: <AANLkTil_ArjbhqG4_ECRZMzI7mmAlFPRG4pBMAnn5Lok@mail.gmail.com>

2010/6/16 "Martin v. Löwis" <martin at v.loewis.de>:
>> What's important also, is to make sure z3c.pypimirror includes the
>> server-side work, so existing mirrors can be upgraded.
>
> Not really. z3c.pypimirror has a completely different function.

It's a mirroring script for PyPI. Why do you say it has a completely
different function ?

> Operators
> providing one of the official PyPI mirrors should use pep381client instead.
>
> Of course, if people absolutely want to, they could also put PEP 381 support
> in z3c.pypimirror, but that may result in a significant rewrite.

What I had in mind was using pep381client within  z3c.pypimirror.

From jcea at jcea.es  Wed Jun 16 00:39:22 2010
From: jcea at jcea.es (Jesus Cea)
Date: Wed, 16 Jun 2010 00:39:22 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <201006160824.23449.steve@pearwood.info>
References: <4C1768AF.9040606@egenix.com>	<AANLkTikp7gTnJAXxCrXbwxf0C4Lz2h4ka8Udlw0r4tgO@mail.gmail.com>	<4C17BBE5.4010901@jcea.es>
	<201006160824.23449.steve@pearwood.info>
Message-ID: <4C18011A.3000202@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 16/06/10 00:24, Steven D'Aprano wrote:
> I would not be digitally signing anything I didn't create unless I had 
> good legal advice that it was safe to do so.

The PyPI signature certifies that the package has not been tampered
with. It DOES NOT certify anything else.

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTBgBGplgi5GaxT1NAQL/fAP/a1GAtmt9kkVzMBiKA7G1hYZ6BG7bOdt0
D3+q5ces91uk6lmzU+HZXCl4pfCljCMYsQjuKa1EP6aNGOT/beAr35s7K2+4S/FE
FjBwchWe5YJaJY7gaMUoWakf0Dz9x4rgebd/Aa2a2Qi14fuA2JJyeOzrIcwgRfwQ
wgNq65M3ke8=
=IgAh
-----END PGP SIGNATURE-----

From martin at v.loewis.de  Wed Jun 16 00:45:26 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 16 Jun 2010 00:45:26 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <201006160824.23449.steve@pearwood.info>
References: <4C1768AF.9040606@egenix.com>	<AANLkTikp7gTnJAXxCrXbwxf0C4Lz2h4ka8Udlw0r4tgO@mail.gmail.com>	<4C17BBE5.4010901@jcea.es>
	<201006160824.23449.steve@pearwood.info>
Message-ID: <4C180286.1060807@v.loewis.de>

> I would not be digitally signing anything I didn't create unless I had
> good legal advice that it was safe to do so.

I'm actually not worried about this. In my own country, a valid digital 
signature requires much more than invocation of the RSA algorithm. E.g.,
certain certified information about the key holder must be available
(including some identification of the key holder). The PyPI
signatures don't include any identification information.

Also, the only thing that *does* get signed are the simple index pages, 
and indeed, I not only sign them, I also generate them.
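
Because only the index pages are signed, a client has to tie each
downloaded file back to the signed page through the checksum embedded in
the download link. A minimal sketch of that step, assuming links carry an
`#md5=<hexdigest>` fragment; the function name is made up, and verifying
the signature on the index page itself is not shown.

```python
import hashlib
from urllib.parse import urlparse


def matches_link_checksum(link, data):
    """Check `data` against the `#md5=...` fragment of a download link."""
    fragment = urlparse(link).fragment
    if not fragment.startswith("md5="):
        return False  # no checksum in the link, so nothing can be verified
    expected = fragment[len("md5="):].lower()
    return hashlib.md5(data).hexdigest() == expected
```

A file fetched from a mirror would be accepted only if this returns True
for the link taken from a signed index page.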

Regards,
Martin

From ianb at colorstudy.com  Wed Jun 16 00:47:57 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 15 Jun 2010 17:47:57 -0500
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C1768AF.9040606@egenix.com>
References: <4C1768AF.9040606@egenix.com>
Message-ID: <AANLkTik5IparaL64VJbcRuaQAgUKOq2Byliqhx5Hrlgf@mail.gmail.com>

Hmm... long thread.

Anyway: I'm +1 on using a CDN.  I think the overhead of managing a mirror
network is considerably greater than the cost of the CDN, and more
error-prone.  With a CDN one developer can figure out how to implement this
in PyPI, and any problems will be with PyPI, not some other mirror system
that the person debugging the problem doesn't control.

I think your cost only covers bandwidth, but there are also storage costs.
What disk space are the PyPI packages using right now?  That will only
increase over time as PyPI generally keeps all releases.  Possibly CDN space
could be donated.  As an implementation note, Google's new system copies
S3's API (http://code.google.com/apis/storage/) -- I'm not sure if it covers
the same territory as CloudFront though.  Anyway, implementing to
S3/CloudFront probably is a good bet even if the provider changes in the
future.

For generating /simple/ with a cronjob, I'm -0.  I find these delays make
testing difficult and unreliable; you can never be sure if the job is just
slow, what you did didn't work, etc.  I'd rather see PyPI shift to creating
static pages on-demand, that is, anytime they need updating.  Then if PyPI
goes down the static pages still exist and work, but there's no delay.
Another option might be a caching proxy configured to serve up cached copies
when the underlying system is down... but I'm not sure if that's any less
work ultimately, and is more ongoing administration.

I don't see a benefit to moving further into the cloud, such as hosting on
multiple machines.  I suspect that PyPI is not anywhere near needing more
power than a good sized server can provide, and I doubt that will change
soon.  It will be easier to manage the system with a single machine and
database.  There won't be network problems where app servers can't access
the database, for instance.  Or a need for replication, which is another big
potential administration hassle.


>  * scalability
>  * 24/7 system administration management
>  * geo-localized fast and reliable access
>
>
> Current Situation
> -----------------
>
> PyPI is currently run from a single server hosted in The Netherlands
> (ximinez.python.org).  This server is run by a very small team of sys
> admin.
>

As far as I know, none of this changes how much administration load there
is, does it?  That is, cloud machines still need to be administered.  The
only way I see that you'd really decrease administration load is with a more
radical move to a managed service, like App Engine.  That's probably quite
doable and would have substantial advantages, but it feels like a quite
different approach than is proposed here and it involves lots more coding.

Unless there really is a problem with the physical management of the server?

> Server side: upload cronjobs
> ----------------------------
>
> Since the /simple index tree is currently being created dynamically,
> we'd need to create static copies of it at regular intervals in order
> to upload the content to the S3 bucket. This can easily be done using
> tools such as wget or curl.
>
> Both the static copy of the /simple tree and the static files uploaded
> to /packages then need to be uploaded or updated in the S3 bucket by a
> cronjob running every 10-20 minutes.
>

Is it easy to sync something with S3?  It's easy to upload, delete, etc.,
but sync is rather different, no?  Not a big deal, just that changes would
have to be tracked if sync was not efficient.
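
One way to track changes is to compare a manifest of local files against
a manifest of what is already in the bucket and transfer only the
difference. A rough sketch, assuming both manifests are plain dicts that
map a path to a checksum; building those manifests (and the actual S3
calls) is not shown, and the function name is made up.

```python
def plan_sync(local, remote):
    """Compare two {path: checksum} manifests.

    Returns (to_upload, to_delete): paths that are new or changed locally,
    and paths that exist only remotely.
    """
    to_upload = sorted(p for p, digest in local.items() if remote.get(p) != digest)
    to_delete = sorted(p for p in remote if p not in local)
    return to_upload, to_delete
```

Unchanged files appear in neither list, so a cronjob run with no new
uploads would transfer nothing.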


> Server side: redirection setup
> ------------------------------
>
> Since PyPI wasn't designed to be put on a CDN, it mixes static file
> URL paths with dynamic access ones, e.g.
>
> dynamic:
>
>  http://pypi.python.org/pypi
>  (and a few others)
>
> static:
>
>  http://pypi.python.org/simple
>  http://pypi.python.org/packages
>
> To move part of the URL path tree to a CDN, which works based on
> domains, we will need to provide a URL redirection setup that
> redirects client side tools to the new location.
>

As far as I know /packages isn't accessed directly, but only from links from
/simple -- so if those links are updated everything should work.  Some
packages already aren't on PyPI, so there's no particular expectation about
hosting location.

If /simple/ is served as a set of static files from ximinez, will that be
reliable enough?  If so, no redirects will be required.  I don't know what
exactly has caused the failures.  If it's networking, then redirects would
help.  If it's services failing, then static files will solve it.  If it's the
entire machine getting wonky, e.g., if memory is exhausted... then quite
possibly static files will help avoid those situations, but it's not a
guarantee.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From martin at v.loewis.de  Wed Jun 16 00:55:41 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 16 Jun 2010 00:55:41 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTilH12k89h5SwCW8CKUue3JASTESmzzxB_l-IOAU@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>
	<4C17B6CE.20209@jcea.es>	<4C17BC38.6090208@egenix.com>
	<4C17C4B5.3000801@jcea.es>	<AANLkTik-1Z6vecE5b83fKNA-qo687EiQUQDq5CIM5Oo5@mail.gmail.com>	<4C17F6D4.2050504@jcea.es>
	<AANLkTilH12k89h5SwCW8CKUue3JASTESmzzxB_l-IOAU@mail.gmail.com>
Message-ID: <4C1804ED.8030708@v.loewis.de>

> Is the plan to use what is proposed in
> http://mail.python.org/pipermail/catalog-sig/2009-March/002018.html in
> practice?

You mean, is it implemented and deployed? Sure - just try for yourself.

> Is more information available about this?

This is not a very specific question. The answer is certainly: yes, e.g.
the source code of PyPI.

> Does this protect against man-in-the-middle attacks?

Hmm. This is also not very specific. Sometimes yes, sometimes no.

It protects against men sitting in the middle of a package download, and
also against men sitting on a mirror (which are both in the middle 
between PyPI and the user).

It doesn't protect against men sitting in the middle of the serverkey 
download, or men sitting in the middle of a setuptools installation
process, or men sitting on PyPI itself (which would be in the middle 
between the package author and the user).

Regards,
Martin

From martin at v.loewis.de  Wed Jun 16 01:01:17 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 16 Jun 2010 01:01:17 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTil_ArjbhqG4_ECRZMzI7mmAlFPRG4pBMAnn5Lok@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>	<4C17A419.4060602@egenix.com>	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>	<4C17B9B2.10006@egenix.com>	<AANLkTik_MDSpgfgxFaEzomiDTWhcb2aBu3eUahRVf8V7@mail.gmail.com>	<4C17DF0A.3090008@egenix.com>	<AANLkTimaEw7oK4JmLIK3EcrQyy5CNveDT13lO-Fxdwu4@mail.gmail.com>	<4C17EFA3.6050204@egenix.com>	<AANLkTikVb7NPEmC8MYjvSBagjHeKVSoANmxZFpe76_7Y@mail.gmail.com>	<4C17FBE9.8040400@v.loewis.de>
	<AANLkTil_ArjbhqG4_ECRZMzI7mmAlFPRG4pBMAnn5Lok@mail.gmail.com>
Message-ID: <4C18063D.7000708@v.loewis.de>

On 16.06.2010 00:38, Tarek Ziadé wrote:
> 2010/6/16 "Martin v. Löwis"<martin at v.loewis.de>:
>>> What's important also, is to make sure z3c.pypimirror includes the
>>> server-side work, so existing mirrors can be upgraded.
>>
>> Not really. z3c.pypimirror has a completely different function.
>
> It's a mirroring script for PyPI. Why do you say it has a completely
> different function ?

a) it's a selective mirror (IIUC); a PEP 381 mirror should be complete
b) it's also a superset-mirror, mirroring stuff that actually *isn't* on
    PyPI. This is not needed for PEP 381
c) it edits the simple index pages, thus breaking the page signature.

So it is really aimed at private mirrors (IIUC, that's also what it is 
used for), whereas PEP 381 is about public mirrors.

>> Operators
>> providing one of the official PyPI mirrors should use pep381client instead.
>>
>> Of course, if people absolutely want to, they could also put PEP 381 support
>> in z3c.pypimirror, but that may result in a significant rewrite.
>
> What I had in mind was using pep381client within  z3c.pypimirror.

Not sure how this would work - but you can certainly feel free to copy 
any code that you find useful into z3c.pypimirror.

Regards,
Martin


From martin at v.loewis.de  Wed Jun 16 01:04:36 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 16 Jun 2010 01:04:36 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTinE-JiyR93TT2mA1PV0LebxZ4QKi4AHzN2OvVOl@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>	<AANLkTikp7gTnJAXxCrXbwxf0C4Lz2h4ka8Udlw0r4tgO@mail.gmail.com>
	<4C17BBE5.4010901@jcea.es> <201006160824.23449.steve@pearwood.info>
	<AANLkTinE-JiyR93TT2mA1PV0LebxZ4QKi4AHzN2OvVOl@mail.gmail.com>
Message-ID: <4C180704.9060008@v.loewis.de>

On 16.06.2010 00:37, Fred Drake wrote:
> On Tue, Jun 15, 2010 at 6:24 PM, Steven D'Aprano<steve at pearwood.info>  wrote:
>> A digital signature is not an MD5 checksum, it may have actual legal
>> meaning in many countries equivalent to a pen and paper signature.
>
> I would expect that verifying a package was signed by PyPI to mean no more than
> that the bits match what's available from PyPI for the same name.  (Not sure if
> that's what's in the PEP, but that's what I'd be looking for.)

It's indeed exactly that.

> We'd have to disclaim anything more than that.  But it would be useful to verify
> that a package from a mirror was accurately mirrored.

There are actually two layers here: one is to verify that the 
transmission was not faulty; for this, the md5sum that is already in the 
simple pages should be enough (and *please* don't tell me that md5 is 
broken).

Of course, an adversary could then try to modify the simple pages, 
that's what the actual signatures are for.
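
The first layer can be sketched as follows. This is only an illustration,
assuming the download link on the simple page carries its digest as a
'#md5=<hex>' fragment; the function names are hypothetical. It detects
faulty transmission only, and says nothing about whether the simple page
itself was tampered with.

```python
import hashlib
from urllib.parse import urlparse


def expected_md5(link):
    """Extract the hex digest from a link ending in '#md5=<hex>', or None."""
    fragment = urlparse(link).fragment
    if fragment.startswith("md5="):
        return fragment[len("md5="):].lower()
    return None


def transmission_ok(link, payload):
    """True if payload's MD5 matches the digest in link.

    Returns False when the link has no digest or the digest differs.
    """
    digest = expected_md5(link)
    return digest is not None and hashlib.md5(payload).hexdigest() == digest
```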

Regards,
Martin

From debatem1 at gmail.com  Wed Jun 16 01:33:03 2010
From: debatem1 at gmail.com (geremy condra)
Date: Tue, 15 Jun 2010 16:33:03 -0700
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C1804ED.8030708@v.loewis.de>
References: <4C1768AF.9040606@egenix.com> <4C17B6CE.20209@jcea.es>
	<4C17BC38.6090208@egenix.com> <4C17C4B5.3000801@jcea.es>
	<AANLkTik-1Z6vecE5b83fKNA-qo687EiQUQDq5CIM5Oo5@mail.gmail.com>
	<4C17F6D4.2050504@jcea.es>
	<AANLkTilH12k89h5SwCW8CKUue3JASTESmzzxB_l-IOAU@mail.gmail.com>
	<4C1804ED.8030708@v.loewis.de>
Message-ID: <AANLkTimRYb_72Miaya4uu1cUQKdWroXb7gxPhxpBH3ur@mail.gmail.com>

On Tue, Jun 15, 2010 at 3:55 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> Is the plan to use what is proposed in
>> http://mail.python.org/pipermail/catalog-sig/2009-March/002018.html in
>> practice?
>
> You mean, is it implemented and deployed? Sure - just try for yourself.
>
>> Is more information available about this?
>
> This is not a very specific question. The answer is certainly: yes, e.g.
> the source code of PyPI.
>
>> Does this protect against man-in-the-middle attacks?
>
> Hmm. This is also not very specific. Sometimes yes, sometimes no.
>
> It protects against men sitting in the middle of a package download, and
> also against men sitting on a mirror (which are both in the middle between
> PyPI and the user).
>
> It doesn't protect against men sitting in the middle of the serverkey
> download, or men sitting in the middle of a setuptools installation
> process, or men sitting on PyPI itself (which would be in the middle between
> the package author and the user).

I'm not clear on this and the document is a little vague, so perhaps
I should be perusing the source, but if you don't protect against a
serverkey MITM and you are supposed to update the serverkey any
time a signature doesn't match up, couldn't an attacker just MITM
you, produce a known bad signature, and then wait for you to
request a serverkey from them?

Geremy Condra

From martin at v.loewis.de  Wed Jun 16 08:09:58 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 16 Jun 2010 08:09:58 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTimRYb_72Miaya4uu1cUQKdWroXb7gxPhxpBH3ur@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>	<4C17B6CE.20209@jcea.es>	<4C17BC38.6090208@egenix.com>	<4C17C4B5.3000801@jcea.es>	<AANLkTik-1Z6vecE5b83fKNA-qo687EiQUQDq5CIM5Oo5@mail.gmail.com>	<4C17F6D4.2050504@jcea.es>	<AANLkTilH12k89h5SwCW8CKUue3JASTESmzzxB_l-IOAU@mail.gmail.com>	<4C1804ED.8030708@v.loewis.de>
	<AANLkTimRYb_72Miaya4uu1cUQKdWroXb7gxPhxpBH3ur@mail.gmail.com>
Message-ID: <4C186AB6.2030407@v.loewis.de>

> I'm not clear on this and the document is a little vague, so perhaps
> I should be perusing the source, but if you don't protect against a
> serverkey MITM and you are supposed to update the serverkey any
> time a signature doesn't match up, couldn't an attacker just MITM
> you, produce a known bad signature, and then wait for you to
> request a serverkey from them?

That's true; transmission of the serverkey is not currently protected 
against MITM. How would you suggest to fix that?

As for perusing the source: the client behavior is not implemented yet, 
so there isn't really any source to check, yet.

Regards,
Martin


From martin at v.loewis.de  Wed Jun 16 08:40:40 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 16 Jun 2010 08:40:40 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <4C186AB6.2030407@v.loewis.de>
References: <4C1768AF.9040606@egenix.com>	<4C17B6CE.20209@jcea.es>	<4C17BC38.6090208@egenix.com>	<4C17C4B5.3000801@jcea.es>	<AANLkTik-1Z6vecE5b83fKNA-qo687EiQUQDq5CIM5Oo5@mail.gmail.com>	<4C17F6D4.2050504@jcea.es>	<AANLkTilH12k89h5SwCW8CKUue3JASTESmzzxB_l-IOAU@mail.gmail.com>	<4C1804ED.8030708@v.loewis.de>	<AANLkTimRYb_72Miaya4uu1cUQKdWroXb7gxPhxpBH3ur@mail.gmail.com>
	<4C186AB6.2030407@v.loewis.de>
Message-ID: <4C1871E8.9060503@v.loewis.de>

> That's true; transmission of the serverkey is not currently protected
> against MITM. How would you suggest to fix that?
>
> As for perusing the source: the client behavior is not implemented yet,
> so there isn't really any source to check, yet.

Following up to myself: The mirroring protocol doesn't really *need*
to protect against MITM. Communication with PyPI (e.g. package download) 
currently isn't protected against MITM, either, so the mirroring adds no 
new threat here. The protocol primarily protects against malicious 
mirror operators, and hacked mirrors.

With that, a simple solution might be to offer an opt-out from automatic
serverkey updates. Users who worry about MITM should manually install the
serverkey in their pypirc; distribute could then refuse to update it
automatically. In the case of key rollover, users would need to download
the server key again in a trusted manner.
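
A rough sketch of how such an opt-out could look on the client side. The
[mirrors] section and both option names are hypothetical; nothing like
this exists in pypirc today, and the real client behavior is not
implemented yet.

```python
import configparser


def load_key_policy(text):
    """Return (pinned_key, auto_update) from pypirc-style text.

    The [mirrors] section and both option names are hypothetical.
    Installing a key by hand turns automatic updates off unless the
    user explicitly asks for them.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    pinned = parser.get("mirrors", "serverkey", fallback=None)
    auto = parser.getboolean(
        "mirrors", "auto-update-serverkey", fallback=pinned is None
    )
    return pinned, auto


def on_bad_signature(auto_update):
    """Decide what the client does when a signature fails to verify."""
    if auto_update:
        # Default behavior: fetch a new key, which is open to MITM.
        return "refetch-key"
    # Opted out: never replace the installed key automatically.
    return "abort"
```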

Regards,
Martin

From justinc at cs.washington.edu  Wed Jun 16 08:41:45 2010
From: justinc at cs.washington.edu (Justin Cappos)
Date: Tue, 15 Jun 2010 23:41:45 -0700
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C186AB6.2030407@v.loewis.de>
References: <4C1768AF.9040606@egenix.com> <4C17B6CE.20209@jcea.es>
	<4C17BC38.6090208@egenix.com> <4C17C4B5.3000801@jcea.es>
	<AANLkTik-1Z6vecE5b83fKNA-qo687EiQUQDq5CIM5Oo5@mail.gmail.com>
	<4C17F6D4.2050504@jcea.es>
	<AANLkTilH12k89h5SwCW8CKUue3JASTESmzzxB_l-IOAU@mail.gmail.com>
	<4C1804ED.8030708@v.loewis.de>
	<AANLkTimRYb_72Miaya4uu1cUQKdWroXb7gxPhxpBH3ur@mail.gmail.com>
	<4C186AB6.2030407@v.loewis.de>
Message-ID: <AANLkTikD4EiiOWcsRWWW6btxOQk2gO6EPs-Xa69fRPao@mail.gmail.com>

On Tue, Jun 15, 2010 at 11:09 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> I'm not clear on this and the document is a little vague, so perhaps
>> I should be perusing the source, but if you don't protect against a
>> serverkey MITM and you are supposed to update the serverkey any
>> time a signature doesn't match up, couldn't an attacker just MITM
>> you, produce a known bad signature, and then wait for you to
>> request a serverkey from them?
>
> That's true; transmission of the serverkey is not currently protected
> against MITM. How would you suggest to fix that?

A simple way to protect against just the issue you mentioned is to
have the clients retrieve the key over HTTPS or distribute the key
with the client.

In general, the problems are much, much trickier than just this.   I
won't bore you with all of the details (unless you'd like to know
more), but we found and fixed a lot of problems with the security of
Linux package managers.   A quick pointer to some of the technical
details can be found here:
http://www.cs.arizona.edu/stork/packagemanagersecurity/papers.html

> As for perusing the source: the client behavior is not implemented yet, so
> there isn't really any source to check, yet.

Okay.   We'd be happy to work with you to get an easy solution put in
place.   As I was shamelessly plugging before, we've been working on a
library called TUF that is supposed to make this as simple as possible
for whoever maintains the repository and be completely transparent
for the clients.

TUF is at a fairly early stage (our first major deployment is ongoing),
but might be worth consideration.   I think we could probably put
together a quick demo so that you and others could see how it might
work with one of the existing client updaters.

Thanks,
Justin

From marrakis at gmail.com  Wed Jun 16 09:33:39 2010
From: marrakis at gmail.com (Mathieu Leduc-Hamel)
Date: Wed, 16 Jun 2010 09:33:39 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTilqUWQE4dpq8JNduHjhAnct5sgKGU8X3UNPhn69@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C15F5F3.40501@egenix.com>
	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>
	<4C176BD4.3080909@egenix.com>
	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>
	<4C17CE55.5000601@v.loewis.de>
	<AANLkTikuYn0XoVHBCd4CKew5xvI8JJdXzKOEmjCXmn7W@mail.gmail.com>
	<AANLkTilqUWQE4dpq8JNduHjhAnct5sgKGU8X3UNPhn69@mail.gmail.com>
Message-ID: <AANLkTilxNdFZwKAzj3F5PBxc85QTYHmgPqaPA29Iq4a3@mail.gmail.com>

>
> Note that Martin is doing the final step (checking the changes before
> they go in production
> and updating the production server).
>

For sure! I wasn't saying that I wanted to be able to push anything
directly to PyPI or to the official repository.

My point was more about making it easier for contributors to fork, modify
and propose changes...

From solipsis at pitrou.net  Wed Jun 16 13:44:56 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 16 Jun 2010 11:44:56 +0000 (UTC)
Subject: [Catalog-sig] Mercurial
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<4C12A54D.1070406@egenix.com>	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>	<4C14D8E8.4010903@egenix.com>	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>	<4C15F5F3.40501@egenix.com>	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>	<4C176BD4.3080909@egenix.com>	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>	<4C17CE55.5000601@v.loewis.de>
	<AANLkTilaoMa0OEkwcRzlL2ild5C4fGVf1hg-C5lsQmBu@mail.gmail.com>
	<4C17F065.7070309@v.loewis.de>
Message-ID: <loom.20100616T134349-986@post.gmane.org>

Martin v. Löwis <martin <at> v.loewis.de> writes:
> 
> > As a maintainer of the PyPI project, it makes your workflow simpler,
> >
> > - contributors can clone the repo, change the code and ask you for a pull
> > - you can pull changes by direct hg commands, and merge them
> 
> After using Mercurial in one project, I'm skeptical that this really 
> makes things simpler. I find it very hard to find out what changes a 
> specific clone has that I still need to integrate. Also, when merging 
> with conflicts, I find it very difficult to determine whether I merged 
> all the conflicts correctly (since the diff will show all changes, not 
> just the conflicts).
> 
> So I rather expect things to become more difficult when switching to hg.

I think it would be fair to raise those points on the Mercurial mailing list.
After all, we'll be one of their "high-profile" users, so they'd probably like
us to enjoy the experience.

Regards

Antoine.



From solipsis at pitrou.net  Wed Jun 16 13:53:00 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 16 Jun 2010 11:53:00 +0000 (UTC)
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
	<4C17A419.4060602@egenix.com>
	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>
	<A447F7AF-18D4-4E07-8F9B-BDA0E6BC92D2@mac.com>
	<AANLkTilm3RLO5Z5uJzCtjYUqmACLOl-jWucgQv13IicH@mail.gmail.com>
	<4C17BBC3.3050205@egenix.com>
	<AANLkTikXWnAlk_MVppqgCX6WhouDvqC0w8fPUJRpR6HK@mail.gmail.com>
Message-ID: <loom.20100616T135105-60@post.gmane.org>

Tarek Ziadé <ziade.tarek <at> gmail.com> writes:
> 
> And we happen to have this network already: lots of people
> will host a PyPI mirror as soon as it's easy to set one imho.

You must be careful that the mirrors are properly managed and administered,
though. Having stale or dysfunctional mirrors is worse than having no mirrors
at all.
It is likely that some people will set up a mirror and then "forget" to take
care of it. Like our buildbots, really.

Regards

Antoine.



From solipsis at pitrou.net  Wed Jun 16 14:03:30 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 16 Jun 2010 12:03:30 +0000 (UTC)
Subject: [Catalog-sig] [OT] Nagios / Shinken
References: <4C1768AF.9040606@egenix.com>
	<201006160033.46095.steve@pearwood.info>
	<4C17A272.9070808@egenix.com>
Message-ID: <loom.20100616T135850-492@post.gmane.org>

M.-A. Lemburg <mal <at> egenix.com> writes:
> 
> Setting up some Zenoss or Nagios monitoring system to take
> care of monitoring the PyPI server (and our other servers)
> would be a separate project.

Just for the record, I would mention that someone started a rewrite of the
Nagios software in Python:
http://www.shinken-monitoring.org/

According to the author, the Python rewrite is also much faster than the
original C software:
http://www.shinken-monitoring.org/features/huge-performances/

Probably a good showcase of the "using a dynamic language allows you to focus on
a better architecture" argument :-)

Regards

Antoine.



From chrism at plope.com  Wed Jun 16 14:09:14 2010
From: chrism at plope.com (Chris McDonough)
Date: Wed, 16 Jun 2010 08:09:14 -0400
Subject: [Catalog-sig] [OT] Nagios / Shinken
In-Reply-To: <loom.20100616T135850-492@post.gmane.org>
References: <4C1768AF.9040606@egenix.com>
	<201006160033.46095.steve@pearwood.info> <4C17A272.9070808@egenix.com>
	<loom.20100616T135850-492@post.gmane.org>
Message-ID: <1276690154.2688.26.camel@thinko>

Even more OT, we might try setting up the PyPI server under supervisord
(http://supervisord.org) plus superlance's HTTPOK and memmon event
listeners.  This would make sure that the process is restarted when it
stops answering HTTP requests or if it begins to consume "too much"
memory.  It's slightly more reliable than other systems that do similar
things, because it's the parent process of the processes being
monitored.

On Wed, 2010-06-16 at 12:03 +0000, Antoine Pitrou wrote:
> M.-A. Lemburg <mal <at> egenix.com> writes:
> > 
> > Setting up some Zenoss or Nagios monitoring system to take
> > care of monitoring the PyPI server (and our other servers)
> > would be a separate project.
> 
> Just for the record, I would mention that someone started a rewrite of the
> Nagios software in Python:
> http://www.shinken-monitoring.org/
> 
> According to the author, the Python rewrite is also much faster than the
> original C software:
> http://www.shinken-monitoring.org/features/huge-performances/
> 
> Probably a good showcase of the "using a dynamic language allows you to focus on
> a better architecture" argument :-)
> 
> Regards
> 
> Antoine.
> 
> 



From mal at egenix.com  Wed Jun 16 14:20:00 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 16 Jun 2010 14:20:00 +0200
Subject: [Catalog-sig] [OT] Nagios / Shinken
In-Reply-To: <loom.20100616T135850-492@post.gmane.org>
References: <4C1768AF.9040606@egenix.com>	<201006160033.46095.steve@pearwood.info>	<4C17A272.9070808@egenix.com>
	<loom.20100616T135850-492@post.gmane.org>
Message-ID: <4C18C170.7060102@egenix.com>

Antoine Pitrou wrote:
> M.-A. Lemburg <mal <at> egenix.com> writes:
>>
>> Setting up some Zenoss or Nagios monitoring system to take
>> care of monitoring the PyPI server (and our other servers)
>> would be a separate project.
> 
> Just for the record, I would mention that someone started a rewrite of the
> Nagios software in Python:
> http://www.shinken-monitoring.org/

> According to the author, the Python rewrite is also much faster than the
> original C software:
> http://www.shinken-monitoring.org/features/huge-performances/
> 
> Probably a good showcase of the "using a dynamic language allows you to focus on
> a better architecture" argument :-)

Zenoss is written in Python and uses Zope for the web GUI. It has
a large community around it and provides all the enterprise
features you'd need from such a system.

    http://www.zenoss.com/

and Zenoss can use Nagios plugins as well.

I'd see a chance for such a new tool, though: Zenoss can be very
complicated to set up, esp. if you're not using SNMP on all your
machines.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 16 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                32 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From mal at egenix.com  Wed Jun 16 14:20:09 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 16 Jun 2010 14:20:09 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <loom.20100616T135105-60@post.gmane.org>
References: <4C1768AF.9040606@egenix.com>	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>	<4C17A419.4060602@egenix.com>	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>	<A447F7AF-18D4-4E07-8F9B-BDA0E6BC92D2@mac.com>	<AANLkTilm3RLO5Z5uJzCtjYUqmACLOl-jWucgQv13IicH@mail.gmail.com>	<4C17BBC3.3050205@egenix.com>	<AANLkTikXWnAlk_MVppqgCX6WhouDvqC0w8fPUJRpR6HK@mail.gmail.com>
	<loom.20100616T135105-60@post.gmane.org>
Message-ID: <4C18C179.4080709@egenix.com>

Antoine Pitrou wrote:
> Tarek Ziadé <ziade.tarek <at> gmail.com> writes:
>>
>> And we happen to have this network already: lots of people
>> will host a PyPI mirror as soon as it's easy to set one imho.
> 
> You must be careful that the mirrors are properly managed and administered,
> though. Having stale/dysfunctioning mirrors is worse than having no mirrors at 
> all.
> It is likely that some people will setup a mirror and then "forget" to take care
> about it. Like our buildbots really.

Right, it's that administration overhead I was referring to.

Perhaps we should just let the users decide:

a) they use the default PyPI access (which we then enhance by
   caching the content in the cloud)

b) they set up their easy_install or zc.buildout to pull data
   from a mirror network by enabling a configuration option

Since implementing option b) will require updating existing
package tools on the client side anyway, the extra configuration
shouldn't be a problem.

Option a) requires no changes whatsoever on the client side.

-- 
Marc-Andre Lemburg
eGenix.com


From jacob at jacobian.org  Wed Jun 16 18:39:32 2010
From: jacob at jacobian.org (Jacob Kaplan-Moss)
Date: Wed, 16 Jun 2010 11:39:32 -0500
Subject: [Catalog-sig] Renaming packages
Message-ID: <AANLkTilc7Me4bYGG1Gp3X1A8nCqKqHSynEO1FiZwecrb@mail.gmail.com>

Howdy folks --

I've received a request from the Debian and Ubuntu maintainers to
rename one of my packages [1] so that it'd comply better with the
Debian/Ubuntu naming standards. I'd like to help them out, and ideally
I'd like to rename my package on PyPI to match the name that APT will
use. However, as far as I can tell there's no real mechanism for
renaming packages on PyPI: if I change the name, everyone's
pip/buildout dependencies will just fail until they, too, update the
name.

Ideally, I'd expect PyPI to give me a renaming mechanism that'd issue
the proper redirects from the old name to the new. Apologies if I'm
just not seeing a feature that's already there; if it's not, though,
are there any plans for this in the future? Or any other bright ideas?

Thanks!

Jacob

[1] http://pypi.python.org/pypi/python-cloudservers

From sridharr at activestate.com  Wed Jun 16 19:06:58 2010
From: sridharr at activestate.com (Sridhar Ratnakumar)
Date: Wed, 16 Jun 2010 10:06:58 -0700
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <4C12A2E4.2090305@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de>
Message-ID: <362E7782-303B-4ED1-803A-EA82762F6365@activestate.com>


On 2010-06-11, at 1:56 PM, Martin v. Löwis wrote:

> If you are willing to invest *a lot* of time, then it seems that rewriting PyPI in Django would make a lot of people happy, because
> they claim they can't contribute to the current code base because
> they don't understand that. I don't want to do such a rewrite on
> my own because I *do* understand the code base (despite not having written it in the first place, so I think that if you really want
> to contribute, you can learn how it works); it also violates Joel
> Spolsky's principle of never ever doing rewrites.

FYI: I just happened to stumble upon what claims to be a "re-implementation of PyPI" in Django:
http://pypi.python.org/pypi/djangopypi/0.4

-srid

From debatem1 at gmail.com  Wed Jun 16 19:42:25 2010
From: debatem1 at gmail.com (geremy condra)
Date: Wed, 16 Jun 2010 13:42:25 -0400
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <AANLkTikD4EiiOWcsRWWW6btxOQk2gO6EPs-Xa69fRPao@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com> <4C17B6CE.20209@jcea.es>
	<4C17BC38.6090208@egenix.com> <4C17C4B5.3000801@jcea.es>
	<AANLkTik-1Z6vecE5b83fKNA-qo687EiQUQDq5CIM5Oo5@mail.gmail.com>
	<4C17F6D4.2050504@jcea.es>
	<AANLkTilH12k89h5SwCW8CKUue3JASTESmzzxB_l-IOAU@mail.gmail.com>
	<4C1804ED.8030708@v.loewis.de>
	<AANLkTimRYb_72Miaya4uu1cUQKdWroXb7gxPhxpBH3ur@mail.gmail.com>
	<4C186AB6.2030407@v.loewis.de>
	<AANLkTikD4EiiOWcsRWWW6btxOQk2gO6EPs-Xa69fRPao@mail.gmail.com>
Message-ID: <AANLkTil7oAFuG1aMfA2WEIFAVeniBoUFXUPyY7pFUfrs@mail.gmail.com>

On Wed, Jun 16, 2010 at 2:41 AM, Justin Cappos
<justinc at cs.washington.edu> wrote:
> On Tue, Jun 15, 2010 at 11:09 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>>> I'm not clear on this and the document is a little vague, so perhaps
>>> I should be perusing the source, but if you don't protect against a
>>> serverkey MITM and you are supposed to update the serverkey any
>>> time a signature doesn't match up, couldn't an attacker just MITM
>>> you, produce a known bad signature, and then wait for you to
>>> request a serverkey from them?
>>
>> That's true; transmission of the serverkey is not currently protected
>> against MITM. How would you suggest to fix that?
>
> A simple way to protect against just the issue you mentioned is to
> have the clients retrieve the key over HTTPS or distribute the key
> with the client.

I'd just add that this is not currently as simple as it should be in
Python; by default Python does not check certs for HTTPS
connections, so you can't just feed the correct url into urllib and
be sure you're getting the right answer.

http://bugs.python.org/issue1589
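
For illustration, this is roughly what a verifying fetch looks like. It
is a sketch only, assuming a Python 3 interpreter recent enough to
provide ssl.create_default_context and the context argument to urlopen,
which the interpreters discussed in this thread do not have. The
function names are hypothetical and no serverkey URL is implied.

```python
import ssl
import urllib.request


def verifying_context():
    """Build an SSL context that checks the certificate chain and the host name."""
    context = ssl.create_default_context()
    # create_default_context() already enables both checks; they are
    # restated here to make the intent explicit.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context


def fetch_serverkey(url, timeout=10):
    """Fetch url over HTTPS and fail if the certificate does not verify.

    The URL is supplied by the caller; plain HTTP is refused outright.
    """
    if not url.lower().startswith("https://"):
        raise ValueError("refusing to fetch the key over a non-HTTPS URL")
    with urllib.request.urlopen(
        url, timeout=timeout, context=verifying_context()
    ) as resp:
        return resp.read()
```

Note that this still trusts every CA in the system store; shipping the
key with the client avoids that dependency.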

Geremy Condra

From martin at v.loewis.de  Wed Jun 16 20:37:37 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 16 Jun 2010 20:37:37 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <loom.20100616T135105-60@post.gmane.org>
References: <4C1768AF.9040606@egenix.com>	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>	<4C17A419.4060602@egenix.com>	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>	<A447F7AF-18D4-4E07-8F9B-BDA0E6BC92D2@mac.com>	<AANLkTilm3RLO5Z5uJzCtjYUqmACLOl-jWucgQv13IicH@mail.gmail.com>	<4C17BBC3.3050205@egenix.com>	<AANLkTikXWnAlk_MVppqgCX6WhouDvqC0w8fPUJRpR6HK@mail.gmail.com>
	<loom.20100616T135105-60@post.gmane.org>
Message-ID: <4C1919F1.9080506@v.loewis.de>

On 16.06.2010 13:53, Antoine Pitrou wrote:
> Tarek Ziadé <ziade.tarek <at> gmail.com> writes:
>>
>> And we happen to have this network already: lots of people
>> will host a PyPI mirror as soon as it's easy to set one imho.
>
> You must be careful that the mirrors are properly managed and administered,
> though. Having stale/dysfunctioning mirrors is worse than having no mirrors at
> all.

That's not true. The client software can check whether a mirror is 
up-to-date, and proceed to the next mirror if one is outdated.
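
A rough sketch of that client-side check, assuming each mirror can report
the serial number of the last change it has applied and that the client
knows the master's current serial. Both assumptions, and all names below,
are hypothetical; this is not an existing client API.

```python
def pick_fresh_mirror(mirrors, master_serial, get_serial, max_lag=0):
    """Return the first mirror whose serial is within max_lag of the master.

    get_serial is a caller-supplied function that asks one mirror for its
    last applied serial and raises OSError if the mirror is unreachable.
    """
    for mirror in mirrors:
        try:
            serial = get_serial(mirror)
        except OSError:
            # Unreachable mirror: treat it like a stale one and move on.
            continue
        if master_serial - serial <= max_lag:
            return mirror
    # No usable mirror; the caller can fall back to the master index.
    return None
```

The max_lag parameter is a design choice: a client may prefer a slightly
outdated mirror over no mirror at all.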

> It is likely that some people will setup a mirror and then "forget" to take care
> about it. Like our buildbots really.

The same can happen to any infrastructure, though. Amazon may decide to 
change the setup, and then the automated update procedure would break.
Of course, they would give advance notice - but then somebody would
have to react to that advance notice.

With the proposed default redirection of all PyPI downloads to Amazon,
such breakage would affect the entire installation, not just a single 
mirror.

Regards,
Martin

From martin at v.loewis.de  Wed Jun 16 20:40:18 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 16 Jun 2010 20:40:18 +0200
Subject: [Catalog-sig] Mercurial
In-Reply-To: <loom.20100616T134349-986@post.gmane.org>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<4C12A54D.1070406@egenix.com>	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>	<4C14D8E8.4010903@egenix.com>	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>	<4C15F5F3.40501@egenix.com>	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>	<4C176BD4.3080909@egenix.com>	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>	<4C17CE55.5000601@v.loewis.de>	<AANLkTilaoMa0OEkwcRzlL2ild5C4fGVf1hg-C5lsQmBu@mail.gmail.com>	<4C17F065.7070309@v.loewis.de>
	<loom.20100616T134349-986@post.gmane.org>
Message-ID: <4C191A92.9030404@v.loewis.de>

On 16.06.2010 13:44, Antoine Pitrou wrote:
> Martin v. Löwis <martin at v.loewis.de> writes:
>>
>>> As a maintainer of the PyPI project, it makes your workflow simpler,
>>>
>>> - contributors can clone the repo, change the code and ask you for a pull
>>> - you can pull changes by direct hg commands, and merge them
>>
>> After using Mercurial in one project, I'm skeptical that this really
>> makes things simpler. I find it very hard to find out what changes a
>> specific clone has that I still need to integrate. Also, when merging
>> with conflicts, I find it very difficult to determine whether I merged
>> all the conflicts correctly (since the diff will show all changes, not
>> just the conflicts).
>>
>> So I rather expect things to become more difficult when switching to hg.
>
> I think it would be fair to bring those points on the mercurial mailing-list.
> After all we'll be one of their "high-profile" users, so they'd probably like us
> to enjoy the experience.

I'm just a hg beginner, so it's probably all my fault, and I'm not using 
it correctly.

However, I admit that switching from RCS to CVS was easy, and so was 
switching from CVS to SVN. Switching to hg is the most difficult change 
for me. I'm probably getting old.

Regards,
Martin

From martin at v.loewis.de  Wed Jun 16 20:56:06 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 16 Jun 2010 20:56:06 +0200
Subject: [Catalog-sig] Renaming packages
In-Reply-To: <AANLkTilc7Me4bYGG1Gp3X1A8nCqKqHSynEO1FiZwecrb@mail.gmail.com>
References: <AANLkTilc7Me4bYGG1Gp3X1A8nCqKqHSynEO1FiZwecrb@mail.gmail.com>
Message-ID: <4C191E46.1060602@v.loewis.de>

On 16.06.2010 18:39, Jacob Kaplan-Moss wrote:
> Howdy folks --
>
> I've received a request from the Debian and Ubuntu maintainers to
> rename one of my packages [1] so that it'd comply better with the
> Debian/Ubuntu naming standards. I'd like to help them out, and ideally
> I'd like to rename my package on PyPI to match the name that APT will
> use. However, as far as I can tell there's no real mechanism for
> renaming packages on PyPI: if I change the name, everyone's
> pip/buildout dependencies will just fail until they, too, update the
> name.
>
> Ideally, I'd expect PyPI to give me a renaming mechanism that'd issue
> the proper redirects from the old name to the new. Apologies if I'm
> just not seeing a feature that's already there; if it's not, though,
> are there any plans for this in the future? Or any other bright ideas?

There is a renaming mechanism, but it does just that: rename the 
package, and all releases. Also, it's available only to the admin, so 
you have to request it through the bug tracker.

It turns out that this actually causes problems (beyond the 
dependencies): the files are *not* renamed, and that is, at least, 
confusing (because they stop matching the project name). Renaming
the files is not an option either, because they then stop matching
the embedded setup.py.

I think your proposed mechanism wouldn't work too well, either: if you 
issue redirects, then setuptools will follow the redirects, too. 
Depending on the package name you originally requested, it will then 
fail to see either the old files or the new files, since they don't 
match the project name.

So I think the best you can hope for is this:
- you have the old releases, and they are easy_installable only
   with the old name.
- you have the new releases, and they are easy_installable only with
   the new name.

If that's all you can get, I suggest just creating the new package and
releasing under the new package name. For human users of the package 
index, create a single release of the old package, with a description 
that has a link to the new name.
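
The mismatch between file names and project names can be illustrated with
a small sketch. This is a deliberately simplified stand-in and not
setuptools' actual matching logic; the normalization rule and both
function names are assumptions made for the example.

```python
import re


def normalize(name):
    """Lower-case a name and collapse runs of '-', '_' and '.' into '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def file_belongs_to(filename, project):
    """Guess whether a file such as 'foo-1.0.tar.gz' belongs to project.

    The file must start with the project name, followed by a separator and
    something that looks like the start of a version number.
    """
    base = normalize(filename)
    prefix = normalize(project) + "-"
    return base.startswith(prefix) and base[len(prefix):len(prefix) + 1].isdigit()
```

Under this rule, oldname-1.0.tar.gz is found when asking for "oldname" but
not when asking for "newname", whichever page the request is redirected to.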

Regards,
Martin



From fdrake at acm.org  Wed Jun 16 21:14:32 2010
From: fdrake at acm.org (Fred Drake)
Date: Wed, 16 Jun 2010 15:14:32 -0400
Subject: [Catalog-sig] Mercurial
In-Reply-To: <4C191A92.9030404@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com> 
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com> 
	<4C12A2E4.2090305@v.loewis.de> <4C12A54D.1070406@egenix.com> 
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com> 
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com> 
	<4C15F5F3.40501@egenix.com>
	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com> 
	<4C176BD4.3080909@egenix.com>
	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com> 
	<4C17CE55.5000601@v.loewis.de>
	<AANLkTilaoMa0OEkwcRzlL2ild5C4fGVf1hg-C5lsQmBu@mail.gmail.com> 
	<4C17F065.7070309@v.loewis.de>
	<loom.20100616T134349-986@post.gmane.org> 
	<4C191A92.9030404@v.loewis.de>
Message-ID: <AANLkTilZv_WIrlod5MDYArx3XNWMwhKwWqz7X5KnjyAb@mail.gmail.com>

On Wed, Jun 16, 2010 at 2:40 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> However, I admit that switching from RCS to CVS was easy, and so was
> switching from CVS to SVN. Switching to hg is the most difficult change for
> me. I'm probably getting old.

Pretty much the same here; DVCSs have some highly desirable features
(better merging), but there are a lot of other changes to learn before they
can be used effectively.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"A storm broke loose in my mind."  --Albert Einstein

From tjreedy at udel.edu  Wed Jun 16 21:27:09 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 16 Jun 2010 15:27:09 -0400
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C18C179.4080709@egenix.com>
References: <4C1768AF.9040606@egenix.com>	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>	<4C17A419.4060602@egenix.com>	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>	<A447F7AF-18D4-4E07-8F9B-BDA0E6BC92D2@mac.com>	<AANLkTilm3RLO5Z5uJzCtjYUqmACLOl-jWucgQv13IicH@mail.gmail.com>	<4C17BBC3.3050205@egenix.com>	<AANLkTikXWnAlk_MVppqgCX6WhouDvqC0w8fPUJRpR6HK@mail.gmail.com>	<loom.20100616T135105-60@post.gmane.org>
	<4C18C179.4080709@egenix.com>
Message-ID: <hvb8ic$4s2$1@dough.gmane.org>

On 6/16/2010 8:20 AM, M.-A. Lemburg wrote:
> Antoine Pitrou wrote:
>> Tarek Ziadé <ziade.tarek at gmail.com> writes:
>>>
>>> And we happen to have this network already: lots of people
>>> will host a PyPI mirror as soon as it's easy to set one imho.
>>
>> You must be careful that the mirrors are properly managed and administered,
>> though. Having stale/dysfunctioning mirrors is worse than having no mirrors at
>> all.
>> It is likely that some people will setup a mirror and then "forget" to take care
>> about it. Like our buildbots really.
>
> Right, it's that administration overhead I was referring to.
>
> Perhaps we should just let the users decide:
>
> a) they use the default PyPI access (which we then enhance by
>     caching the content in the cloud)
>
> b) they setup their easy_install or zc.buildout to pull data
>     from a mirror network by enabling a configuration option
>
> Since implementing option b) will require updating existing
> package tools on the client side anyway, the extra configuration
> shouldn't be a problem.
>
> Option a) requires no changes whatsoever on the client side.

It seems to me that:

If the problem of availability with pypi is anything like the problems 
with bugs... and extending 'pypi...' with cloud service could be done 
relatively quickly (within a month), then that seems reasonable.

If 'free to psf' mirrors are feasible and needed, then they will still 
be useful, especially in high-download regions. Since Amazon's cloud 
service is metered on a region by region basis, any off-loading of 
demand to regional mirrors will reduce PSF charges. Based on what I have 
read in the thread, I would not be surprised if full mirror deployment 
takes a year. After that, the cloud service could remain to pick up 
slack in a region should the mirror in a region go down.

Any move to incremental update from time-based replacement will benefit 
either system.

Terry Jan Reedy





From solipsis at pitrou.net  Wed Jun 16 20:41:55 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 16 Jun 2010 20:41:55 +0200
Subject: [Catalog-sig] Mercurial
In-Reply-To: <4C191A92.9030404@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de>	<4C12A54D.1070406@egenix.com>
	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>
	<4C14D8E8.4010903@egenix.com>
	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>
	<4C15F5F3.40501@egenix.com>
	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>
	<4C176BD4.3080909@egenix.com>
	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>
	<4C17CE55.5000601@v.loewis.de>
	<AANLkTilaoMa0OEkwcRzlL2ild5C4fGVf1hg-C5lsQmBu@mail.gmail.com>
	<4C17F065.7070309@v.loewis.de>
	<loom.20100616T134349-986@post.gmane.org>
	<4C191A92.9030404@v.loewis.de>
Message-ID: <1276713715.3174.0.camel@localhost.localdomain>

On Wednesday, 16 June 2010 at 20:40 +0200, "Martin v. Löwis" wrote:
> On 16.06.2010 13:44, Antoine Pitrou wrote:
> > Martin v. Löwis <martin at v.loewis.de> writes:
> >>
> >>> As a maintainer of the PyPI project, it makes your workflow simpler,
> >>>
> >>> - contributors can clone the repo, change the code and ask you for a pull
> >>> - you can pull changes by direct hg commands, and merge them
> >>
> >> After using Mercurial in one project, I'm skeptical that this really
> >> makes things simpler. I find it very hard to find out what changes a
> >> specific clone has that I still need to integrate. Also, when merging
> >> with conflicts, I find it very difficult to determine whether I merged
> >> all the conflicts correctly (since the diff will show all changes, not
> >> just the conflicts).
> >>
> >> So I rather expect things to become more difficult when switching to hg.
> >
> > I think it would be fair to bring those points on the mercurial mailing-list.
> > After all we'll be one of their "high-profile" users, so they'd probably like us
> > to enjoy the experience.
> 
> I'm just a hg beginner, so it's probably all my fault, and I'm not using 
> it correctly.

There's no problem in asking beginner questions :)

Regards

Antoine.



From merwok at netwok.org  Wed Jun 16 21:53:11 2010
From: merwok at netwok.org (Éric Araujo)
Date: Wed, 16 Jun 2010 21:53:11 +0200
Subject: [Catalog-sig] Mercurial
In-Reply-To: <4C17F065.7070309@v.loewis.de>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<4C12A54D.1070406@egenix.com>	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>	<4C14D8E8.4010903@egenix.com>	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>	<4C15F5F3.40501@egenix.com>	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>	<4C176BD4.3080909@egenix.com>	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>	<4C17CE55.5000601@v.loewis.de>	<AANLkTilaoMa0OEkwcRzlL2ild5C4fGVf1hg-C5lsQmBu@mail.gmail.com>
	<4C17F065.7070309@v.loewis.de>
Message-ID: <4C192BA7.8010202@netwok.org>

> After using Mercurial in one project, I'm skeptical that this really 
> makes things simpler. I find it very hard to find out what changes a 
> specific clone has that I still need to integrate.

There are commands to compare repositories: incoming and outgoing (read
"hg help incoming").

> Also, when merging with conflicts, I find it very difficult to determine
> whether I merged all the conflicts correctly (since the diff will show
> all changes, not just the conflicts).

I believe that's a known bug. David Wolever is writing an extension to
show only the diff against the automated merge, which would be more
helpful: http://mercurial.selenic.com/wiki/MergediffExtension
Bitbucket uses a similar algorithm to display merge diffs, I think.

With the command-line tool or TortoiseHg, you can check the diff against
the second parent of the merge, which can be more meaningful than the
default diff against the first parent.

We will definitely need tutorials to make the transition smooth. More
advanced users are on python-dev and #python-dev and bugs.python.org to
help newcomers, so don't hesitate to complain about Mercurial.

BTW, the willingness to learn a new tool in such a fundamental area as
version control shows you're not so old. <wink> hginit.com is a really
short tutorial that starts with version control reeducation for
Subversion users.

Regards


From simon at ikanobori.jp  Wed Jun 16 23:15:56 2010
From: simon at ikanobori.jp (Simon de Vlieger)
Date: Wed, 16 Jun 2010 23:15:56 +0200
Subject: [Catalog-sig] PyPI template improvements
Message-ID: <AAD06C81-9B08-4C51-87CD-4C681D91A09F@ikanobori.jp>

Hey all,

the recent activity on this mailing list has kickstarted my
contributing sense. As long as the mirroring debate is still ongoing I
will focus my efforts somewhere else. Namely: the HTML/JavaScript/CSS.
This email has also been submitted to the distutils-sig list as a lot
of power users of PyPI are on there.

In this regard I have a few questions before I really dig into these  
templates:

- Is there a list of improvements, maybe a nice TODO of points which
people want to see improved?
- How are design changes handled, is there a committee to run them
through? People who decide on what gets in and what not? (I'll outline
some of my first thoughts lower in this mail)
- What are the supported browser versions by PyPI, I reckon it's
IE6/7/8+, Fx 2+, Opera 9+ Safari 4+?

The changes I have on my personal 'todo list' are:
- Add labels to all forms.
- Make tables consistent width (see for example the table in the top
of the "Browse packages" page and compare with the table when you
actually select one of the classifiers).
- Restyle the metadata display on package pages and move it up in
the page.
- Have downloads readily available on the right side of the screen
(at least the latest release).
- Look sternly at the top right floating account information page.
- Look at the "your details" page where the form does not align with
the right floating profile box.
- Make one consistent styling for all forms. Include help texts in
all forms.

There are more things I want to do, but this is the start.

I have already cloned Tarek's PyPI clone on Bitbucket and I'll add my  
changes there.

Is there anything you guys (and the users) would really like to see  
improved?

Regards,

Simon de Vlieger-

From martin at v.loewis.de  Wed Jun 16 23:51:17 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 16 Jun 2010 23:51:17 +0200
Subject: [Catalog-sig] PyPI template improvements
In-Reply-To: <AAD06C81-9B08-4C51-87CD-4C681D91A09F@ikanobori.jp>
References: <AAD06C81-9B08-4C51-87CD-4C681D91A09F@ikanobori.jp>
Message-ID: <4C194755.2060704@v.loewis.de>

> - Is there a list of improvements, maybe a nice TODO of points which
> people want to see improved?

The bug tracker: sf.net/projects/pypi

> - How are design changes handled, is there a committee to run them
> through? People who decide on what gets in and what not? (I'll outline
> some of my first thoughts lower in this mail)

No. There are virtually no design changes being proposed that actually 
come with a patch, so nothing needs to be decided on.

> - What are the supported browser versions by PyPI, I reckon it's
> IE6/7/8+, Fx 2+, Opera 9+ Safari 4+?

What do you mean by "supported"? Officially supported, so that you can 
make a help desk call if it won't work? None.

Or do you mean that the browser should be able to use the site? All of 
them, plus any other browser you can think of, including Lynx and wget.

> The changes I have on my personal 'todo list' are:
> - Add labels to all forms.

Please submit a patch. I have no clue what a label of a form is.

> - Make tables consistent width (see for example the table in the top
> of the "Browse packages" page and compare with the table when you
> actually select one of the classifiers).

Again, please submit a patch.

> - Restyle the metadata display on package pages and move it up in the
> page.

Please submit a patch; this would probably need to get support of this list.

> - Have downloads readily available on the right side of the screen (at
> least the latest release).

Not sure what that means; please submit a patch.

> - Look sternly at the top right floating account information page.

Hmm. Whom do you want to look sternly?

> There are more things I want to do, but this is the start.

The key here really is "I ... do". This sounds good.

Regards,
Martin

From lists at zopyx.com  Thu Jun 17 06:22:32 2010
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 17 Jun 2010 06:22:32 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
Message-ID: <4C19A308.5040806@zopyx.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi there,

I propose a policy change for packages registered with PyPI:

 - packages registered on PyPI have at least one release

 - one release of registered package on PyPI _must_ contain
   a valid source code distribution (sdist)

 - packages registered on PyPI without releases or without
   a source code release are subject to removal N days
   after the day of registration
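
To make the three rules concrete, here is a minimal sketch of how such a
check could be expressed. The data model (Package, Release,
hosted_filetypes) is invented for the example and does not reflect PyPI's
actual schema; grace_days stands in for the unspecified N.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Release:
    version: str
    # File types uploaded to the index itself, e.g. ["sdist", "bdist_egg"].
    hosted_filetypes: list = field(default_factory=list)


@dataclass
class Package:
    name: str
    registered: date
    releases: list = field(default_factory=list)


def has_hosted_sdist(package):
    """True if at least one release has an sdist uploaded to the index."""
    return any("sdist" in r.hosted_filetypes for r in package.releases)


def subject_to_removal(package, today, grace_days):
    """True if the package has no hosted sdist and the grace period is over."""
    if has_hosted_sdist(package):
        return False
    return today - package.registered > timedelta(days=grace_days)
```

A package with no releases at all has no hosted sdist either, so the first
and second rule collapse into the same check.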

Why?

Any package registered on PyPI is possibly crucial to any kind of
development and deployment.

Packages hosted on external servers (referenced through a download_url)
are subject to come and go - packages once released should be available
at any time from a well-known location (PyPI). Dependencies on the
availability of external downloads servers other than PyPI are hardly
acceptable for real-world development and deployments.

As an example: the Plone CMS buildouts depend on python-openid.
This package is registered with PyPI

http://pypi.python.org/pypi/python-openid

but refers to

http://openidenabled.com/files/python-openid/packages/python-openid-2.2.4.tar.gz

For whatever reason the download URL is no longer working. In fact:
openidenabled.com now points to http://www.janrain.com.
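
Entries of this kind could be spotted mechanically. The following is a
small sketch, assuming the index is served from pypi.python.org only; the
host set and the function name are illustrative, not an existing tool.

```python
from urllib.parse import urlparse

# Hosts treated as "on PyPI" for this example (an assumption).
PYPI_HOSTS = {"pypi.python.org"}


def is_externally_hosted(download_url):
    """True if the download URL points anywhere other than the index."""
    host = urlparse(download_url).hostname or ""
    return host.lower() not in PYPI_HOSTS
```

Applied to the python-openid download URL above, the function returns True,
which flags the release as dependent on a third-party server.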

Other reasons for disappearing packages in the past:

 - network or server outages of external servers
 - users changed their organization and the organization removed
   content of their former employees

PyPI is a valuable and crucial resource for Python development.
It must be kept up-to-date and consistent.

I don't care about the arguments that were made in the past against
stronger rules ("openness" etc.).

There are a lot of Python programmers around that are not Python geeks
as most of us are and they just become pissed off when packages come and
go or are not in the place where one would expect them.

PyPI is a community resource - but community does not mean anarchy where
everyone should be able to upload their package crap without looking left
and right and having the community and its needs in mind.

PyPI must become a stable package index. Everything registered with PyPI
must be available at any time (mirrors, distributing PyPI in the cloud...).

Andreas

- -- 
ZOPYX Limited           | zopyx group
Charlottenstr. 37/1     | The full-service network for Zope & Plone
D-72070 Tübingen        | Produce & Publish
www.zopyx.com           | www.produce-and-publish.com
- ------------------------------------------------------------------------
E-Publishing, Python, Zope & Plone development, Consulting


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwZowgACgkQCJIWIbr9KYyclQCglMaIFnObClOn3sPfwBWbnV1w
YboAoL8OSErCHFi0nXD4tbF8VnYgbc/i
=3m/N
-----END PGP SIGNATURE-----

From ianb at colorstudy.com  Thu Jun 17 06:30:46 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 16 Jun 2010 23:30:46 -0500
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C1919F1.9080506@v.loewis.de>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com> 
	<4C17A419.4060602@egenix.com>
	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com> 
	<A447F7AF-18D4-4E07-8F9B-BDA0E6BC92D2@mac.com>
	<AANLkTilm3RLO5Z5uJzCtjYUqmACLOl-jWucgQv13IicH@mail.gmail.com> 
	<4C17BBC3.3050205@egenix.com>
	<AANLkTikXWnAlk_MVppqgCX6WhouDvqC0w8fPUJRpR6HK@mail.gmail.com> 
	<loom.20100616T135105-60@post.gmane.org> <4C1919F1.9080506@v.loewis.de>
Message-ID: <AANLkTiloZrUuVoz0ZVPVi8Ih3uxnAgFqtih_mkcZZjaJ@mail.gmail.com>

On Wed, Jun 16, 2010 at 1:37 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:

>> It is likely that some people will setup a mirror and then "forget" to
>> take care about it. Like our buildbots really.
>
> The same can happen to any infrastructure, though. Amazon may decide to
> change the setup, and then the automated update procedure would break.
> Of course, they would give advance notice - but then somebody would
> have to react to that advance notice.
>

That's not very likely, and if something does change it will be extremely
well announced and documented.  Amazon is providing a commercial service
lots of people rely on; their process is formalized and professionalized.
And if Amazon makes mistakes they'll figure out how to avoid them next time,
while mirror providers are a rotating crew that is unlikely to easily or
reliably learn from past mistakes.  If we actually understood each time PyPI
broke and fixed it none of this would be a problem; I'm not blaming anyone
for that, but it's also not going to change and adding lots of mirror
systems just adds more systems with exactly the same management problems
that our current system has.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From aclark at aclark.net  Thu Jun 17 07:11:41 2010
From: aclark at aclark.net (Alex Clark)
Date: Thu, 17 Jun 2010 01:11:41 -0400
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19A308.5040806@zopyx.com>
References: <4C19A308.5040806@zopyx.com>
Message-ID: <hvcaqe$3u9$1@dough.gmane.org>

Hi,


Andreas Jung wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi there,
>
> I propose a policy change for packages registered with PyPI:
>
>   - packages registered on PyPI have at least one release
>
>   - one release of registered package on PyPI _must_ contain
>     a valid source code distribution (sdist)
>
>   - packages registered on PyPI without releases or without
>     source code release are subject to be removed after N days
>     after the day of registration
>
> Why?
>
> Any package registered on PyPI is possibly crucial to any kind of
> development and deployment.
>
> Packages hosted on external servers (referenced through a download_url)
> are subject to come and go - packages once released should be available
> at any time from a well-known location (PyPI). Dependencies on the
> availability of external downloads servers other than PyPI are hardly
> acceptable for real-world development and deployments.
>
> As an example: the Plone CMS buildouts depend on python-openid.
> This package is registered with PyPI
>
> http://pypi.python.org/pypi/python-openid
>
> but references to
>
> http://openidenabled.com/files/python-openid/packages/python-openid-2.2.4.tar.gz
>
> For whatever reason the download URL is no longer working. In fact:
> openidenabled.com now points to http://www.janrain.com.

FWIW, I have uploaded a local copy of that file to:

http://dist.plone.org/thirdparty/python-openid-2.2.4.tar.gz


>
> Other reasons for disappearing package in the past:
>
>   - network or server outages of external servers
>   - users changed their organization and the organization removed
>     content of their former employees
>
> PyPI is a valuable and crucial resource for Python development.
> It must be kept up-to-date and consistent.
>
> I don't care about the arguments that were made in the past against
> stronger rules ("openness" etc.).
>
> There are a lot of Python programmers around that are not Python geeks
> as most of us are and they just become pissed of when packages come and
> go or are not in the place where one would expect them.
>
> PyPI is a community resource - but community does not mean anarchy where
> everyone should be able to upload its package crap without looking left
> and right and having the community and its needs in mind.
>
> PyPI must become a stable package index. Everything registered with PyPI
> must be available at any time (mirrors, distributing PyPI in the cloud...).
>
> Andreas
>
> - --
> ZOPYX Limited           | zopyx group
> Charlottenstr. 37/1     | The full-service network for Zope & Plone
> D-72070 Tübingen        | Produce & Publish
> www.zopyx.com           | www.produce-and-publish.com
> - ------------------------------------------------------------------------
> E-Publishing, Python, Zope & Plone development, Consulting
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (Darwin)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAkwZowgACgkQCJIWIbr9KYyclQCglMaIFnObClOn3sPfwBWbnV1w
> YboAoL8OSErCHFi0nXD4tbF8VnYgbc/i
> =3m/N
> -----END PGP SIGNATURE-----


-- 
Alex Clark · http://aclark.net
Author · Plone 3.3 Site Administration · http://aclark.net/admin


From sridharr at activestate.com  Thu Jun 17 08:01:08 2010
From: sridharr at activestate.com (Sridhar)
Date: Wed, 16 Jun 2010 23:01:08 -0700
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19A308.5040806@zopyx.com>
References: <4C19A308.5040806@zopyx.com>
Message-ID: <4C19BA24.7020709@activestate.com>

On 6/16/2010 9:22 PM, Andreas Jung wrote:
> As an example: the Plone CMS buildouts depend on python-openid.
> This package is registered with PyPI
>
> http://pypi.python.org/pypi/python-openid
>
> but references to
>
> http://openidenabled.com/files/python-openid/packages/python-openid-2.2.4.tar.gz
>
> For whatever reason the download URL is no longer working. In fact:
> openidenabled.com now points to http://www.janrain.com.
>    

This is one of the limitations with z3c.pypimirror that prompted me to
write my own "mirroring" solution. I have a configuration file which
allows me to "override" package metadata for such "crap" data in PyPI:
things like the PyPI entry for a package pointing to an older version of
the tarball, to no tarball at all, or to a broken link such as the one you
mentioned here.
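
The idea can be sketched roughly as follows. This is not the actual tool
or its configuration format; the dictionary layout and the function name
are assumptions, and the replacement URL is the copy Alex Clark uploaded
earlier in this thread.

```python
def apply_overrides(metadata, overrides):
    """Return a copy of a package's metadata with local overrides applied.

    overrides maps a package name to the fields that should replace what
    the index reports; packages without an entry are returned unchanged.
    """
    patched = dict(metadata)
    patched.update(overrides.get(metadata["name"], {}))
    return patched


# Hypothetical override table, keyed by package name.
OVERRIDES = {
    "python-openid": {
        "download_url": "http://dist.plone.org/thirdparty/python-openid-2.2.4.tar.gz",
    },
}
```

The mirror then works from the patched metadata, so a dead upstream link
is fixed locally without touching the index.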

> PyPI is a valuable and crucial resource for Python development.
> It must be kept up-to-date and consistent.
>
> I don't care about the arguments that were made in the past against
> stronger rules ("openness" etc.).
>
> There are a lot of Python programmers around that are not Python geeks
> as most of us are and they just become pissed of when packages come and
> go or are not in the place where one would expect them.
>
> PyPI is a community resource - but community does not mean anarchy where
> everyone should be able to upload its package crap without looking left
> and right and having the community and its needs in mind.
>
> PyPI must become a stable package index. Everything registered with PyPI
> must be available at any time (mirrors, distributing PyPI in the cloud...).
>    

BTW, I posted a similar proposal in distutils-sig@ before, and it led
nowhere. I have no hope for this one either. :-/

So much for participating in a community.

-srid

From cz at gocept.com  Thu Jun 17 08:11:19 2010
From: cz at gocept.com (Christian Zagrodnick)
Date: Thu, 17 Jun 2010 08:11:19 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
References: <4C19A308.5040806@zopyx.com>
Message-ID: <hvcea7$cb5$1@dough.gmane.org>

On 2010-06-17 06:22:32 +0200, Andreas Jung <lists at zopyx.com> said:

> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hi there,
> 
> I propose a policy change for packages registered with PyPI:
> 
>  - packages registered on PyPI have at least one release
> 
>  - one release of registered package on PyPI _must_ contain
>    a valid source code distribution (sdist)
> 
>  - packages registered on PyPI without releases or without
>    source code release are subject to be removed after N days
>    after the day of registration
> 
> Why?
> 
> Any package registered on PyPI is possibly crucial to any kind of
> development and deployment.
> 
> Packages hosted on external servers (referenced through a download_url)
> are subject to come and go - packages once released should be available
> at any time from a well-known location (PyPI). Dependencies on the
> availability of external downloads servers other than PyPI are hardly
> acceptable for real-world development and deployments.

I second that. External download URLs are really a pain.

I don't think that removing packages that way would really solve the 
problem. I think the core is:

* Require the package to have a source dist *on* PyPI
* Forbid removing any source package.

[...]

> PyPI must become a stable package index. Everything registered with PyPI
> must be available at any time (mirrors, distributing PyPI in the cloud...).

ack.


-- 
Christian Zagrodnick · cz at gocept.com
gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
http://gocept.com · tel +49 345 1229889 4 · fax +49 345 1229889 1
Zope and Plone consulting and development



From martin at v.loewis.de  Thu Jun 17 08:58:40 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 17 Jun 2010 08:58:40 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19A308.5040806@zopyx.com>
References: <4C19A308.5040806@zopyx.com>
Message-ID: <4C19C7A0.9080800@v.loewis.de>

> I propose a policy change for packages registered with PyPI:
>
>   - packages registered on PyPI have at least one release
>
>   - one release of registered package on PyPI _must_ contain
>     a valid source code distribution (sdist)
>
>   - packages registered on PyPI without releases or without
>     source code release are subject to be removed after N days
>     after the day of registration

So how would you implement that policy change? Please propose a phased 
approach, that gives affected people plenty of options to intervene if
they disagree with the policy.

Regards,
Martin

From lists at zopyx.com  Thu Jun 17 09:09:55 2010
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 17 Jun 2010 09:09:55 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19C7A0.9080800@v.loewis.de>
References: <4C19A308.5040806@zopyx.com> <4C19C7A0.9080800@v.loewis.de>
Message-ID: <4C19CA43.9000509@zopyx.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin v. Löwis wrote:
>> I propose a policy change for packages registered with PyPI:
>>
>>   - packages registered on PyPI have at least one release
>>
>>   - one release of registered package on PyPI _must_ contain
>>     a valid source code distribution (sdist)
>>
>>   - packages registered on PyPI without releases or without
>>     source code release are subject to be removed after N days
>>     after the day of registration
> 
> So how would you implement that policy change? Please propose a phased
> approach, that gives affected people plenty of options to intervene if
> they disagree with the policy.
> 

It should be fairly easy to figure out affected packages through some
DB query (in fact a similar functionality is already implemented on top
of the XMLRPC API in my zopyx.trashfinder package).

For such packages: send out an email to the package maintainer informing
him about the problem and instructing him to fix the problem within N days.

After N days: recheck the package state and unregister the package if
necessary.
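
A minimal sketch of the check described above, in Python. It assumes
per-release file metadata shaped like the dictionaries returned by the
PyPI XML-RPC release_urls call (each carrying a "packagetype" key). The
function names and the sample data are made up for illustration; this is
not the code in zopyx.trashfinder.

```python
def has_sdist(release_files):
    """Return True if any file of one release is a source distribution."""
    return any(f.get("packagetype") == "sdist" for f in release_files)


def find_noncompliant(packages):
    """Return the sorted names of packages where no release has an sdist.

    `packages` maps a package name to {version: [file metadata dicts]}.
    A package with no releases at all is reported as well.
    """
    bad = []
    for name, releases in packages.items():
        if not any(has_sdist(files) for files in releases.values()):
            bad.append(name)
    return sorted(bad)


# Hypothetical sample data, not real PyPI contents.
sample = {
    "hosted-here": {"1.0": [{"packagetype": "sdist"}]},
    "external-only": {"2.2.4": []},   # registered, files hosted elsewhere
    "eggs-only": {"0.1": [{"packagetype": "bdist_egg"}]},
    "empty": {},                      # registered, no release at all
}

noncompliant = find_noncompliant(sample)
print(noncompliant)
```

The list returned here would be the set of maintainers to notify, and
the same function could be run again after N days for the recheck.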

Or perhaps a less rude approach: introduce a status field for each package
(ACTIVE/INACTIVE) and set the state to INACTIVE when the package does
not comply with this policy. Inactive packages won't be listed on PyPI
and won't be searchable on PyPI. The inactive status should be visible
to the author (in logged-in state) with some warning ("Package is
inactive... please upload your sdist...").
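
The status idea could look roughly like the following Python sketch.
Everything in it is an assumption made for illustration: the function
names, the rule that the grace period starts at registration, and the
value chosen for N. It is not a description of how PyPI works.

```python
from datetime import date, timedelta

ACTIVE, INACTIVE = "ACTIVE", "INACTIVE"


def package_status(registered_on, has_sdist, today, grace_days):
    """Return ACTIVE while compliant or still inside the grace period."""
    if has_sdist:
        return ACTIVE
    deadline = registered_on + timedelta(days=grace_days)
    return ACTIVE if today <= deadline else INACTIVE


def visible_in_listing(status, viewer_is_owner):
    """Inactive packages are hidden from everyone except their owner."""
    return status == ACTIVE or viewer_is_owner


# Hypothetical example: registered on 1 June, checked on 17 June, N = 14.
status = package_status(date(2010, 6, 1), False, date(2010, 6, 17), 14)
print(status, visible_in_listing(status, viewer_is_owner=False))
```

Hiding instead of deleting keeps the registration intact, so uploading
an sdist later would simply flip the package back to ACTIVE.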

Andreas



-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwZykMACgkQCJIWIbr9KYy81wCfWjjQ8yTQbhO6xIfqPYiHQHcc
44sAn2YYFxFPHwJ0PywX306DcMOcabix
=UtO+
-----END PGP SIGNATURE-----

From marrakis at gmail.com  Thu Jun 17 09:27:27 2010
From: marrakis at gmail.com (Mathieu Leduc-Hamel)
Date: Thu, 17 Jun 2010 09:27:27 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <362E7782-303B-4ED1-803A-EA82762F6365@activestate.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de>
	<362E7782-303B-4ED1-803A-EA82762F6365@activestate.com>
Message-ID: <AANLkTilC0V2Bb5lbY5H4ym-AhBzfZJXgQArvvxC9nD0F@mail.gmail.com>

Yeah, for sure there are different implementations of PyPI in Django or with
other frameworks.

You can check this one too: http://pypi.python.org/pypi/chishop/0.2.0

But the question was not necessarily how difficult it would be to do, but
whether it would be acceptable to the community. Anyway, we are on the
right list to discuss that.

Have you tried it? Is it working properly?

On Wed, Jun 16, 2010 at 7:06 PM, Sridhar Ratnakumar <
sridharr at activestate.com> wrote:

>
> On 2010-06-11, at 1:56 PM, Martin v. Löwis wrote:
>
> > If you are willing to invest *a lot* of time, then it seems that
> > rewriting PyPI in Django would make a lot of people happy, because
> > they claim they can't contribute to the current code base because
> > they don't understand that. I don't want to do such a rewrite on
> > my own because I *do* understand the code base (despite not having
> > written it in the first place, so I think that if you really want
> > to contribute, you can learn how it works); it also violates Joel
> > Spolsky's principle of never ever doing rewrites.
>
> FYI: I just happened to stumble upon what claims to be a "re-implementation
> of PyPI" in Django:
> http://pypi.python.org/pypi/djangopypi/0.4
>
> -srid
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>

From martin at v.loewis.de  Thu Jun 17 09:36:01 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 17 Jun 2010 09:36:01 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19CA43.9000509@zopyx.com>
References: <4C19A308.5040806@zopyx.com> <4C19C7A0.9080800@v.loewis.de>
	<4C19CA43.9000509@zopyx.com>
Message-ID: <4C19D061.5020303@v.loewis.de>

> For such packages: send out an email to the package maintainer informing
> him about the problem and instructing him to fix the problem within N days.
>
> After N days: recheck the package state and unregister the package if
> necessary.
>
> Or perhaps a less rude approach: introduce status field for each package
> (ACTIVE/INACTIVE) and set the state to INACTIVE when the package does
> not comply with this policy. Inactive packages won't be listed on PyPI
> and won't be searchable on PyPI. Inactive status should be visible
> to the author (in logged-in state) with some warning "Package is
> inactive..please upload your sdist....).

Ok. If nobody opposes this right now, it's fine with me as well.
However, I won't be able to work on this for several months to come.

IMO, it's a waste of energy: if a package is useless, just don't use it, 
and be done. There are many packages on PyPI that are useless to me 
despite having a source release.

Regards,
Martin

From lists at zopyx.com  Thu Jun 17 09:39:52 2010
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 17 Jun 2010 09:39:52 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19D061.5020303@v.loewis.de>
References: <4C19A308.5040806@zopyx.com> <4C19C7A0.9080800@v.loewis.de>
	<4C19CA43.9000509@zopyx.com> <4C19D061.5020303@v.loewis.de>
Message-ID: <4C19D148.4000308@zopyx.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin v. Löwis wrote:

> IMO, it's a waste of energy: if a package is useless, just don't use it,
> and be done. There are many packages on PyPI that are useless to me
> despite having a source release.
>

"Useless" is not the point. "Availability" matters - the
availability of a package must not depend on external servers other than an
official PyPI server.

Andreas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwZ0UcACgkQCJIWIbr9KYxILwCfSEEdo+Eod9xYSjIVdrNzbBir
X3MAoL/78mNwU52k0K4dkWHkQO+4F//s
=Nnpq
-----END PGP SIGNATURE-----

From mal at egenix.com  Thu Jun 17 09:54:50 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 17 Jun 2010 09:54:50 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19A308.5040806@zopyx.com>
References: <4C19A308.5040806@zopyx.com>
Message-ID: <4C19D4CA.1090304@egenix.com>

Andreas Jung wrote:
> Hi there,
> 
> I propose a policy change for packages registered with PyPI:
> 
>  - packages registered on PyPI have at least one release

I'm not sure what you mean by "release". Every package on
PyPI is a release, since it comes with a version number.

>  - one release of registered package on PyPI _must_ contain
>    a valid source code distribution (sdist)

-100

You'd rule out commercial packages that don't come with a
source distribution. PyPI is for everyone, not only for
open source packages.

Furthermore, not all package authors want to upload their
packages to PyPI.

And lastly, uploading packages to PyPI (still) has a serious
problem: setuptools doesn't know the distinction between
UCS2 and UCS4, so uploading eggs for Unix platforms doesn't
work out in practice. setuptools also doesn't know that
e.g. a Mac OS X fat release may still contain the right binaries
for a non-fat build of Python.
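
To illustrate the UCS2/UCS4 point: on CPython 2.x the build type can be
read from sys.maxunicode, which is 0xFFFF on narrow (UCS2) builds and
0x10FFFF on wide (UCS4) builds. The Python sketch below uses a
hypothetical "ucs2"/"ucs4" tag to decide whether a binary egg matches an
interpreter; the helper names are made up, and the lack of such a tag in
egg filenames is exactly the problem described above. (Since Python 3.3
the distinction no longer exists and sys.maxunicode is always 0x10FFFF.)

```python
import sys


def unicode_width_tag(maxunicode):
    """Map a sys.maxunicode value to a build tag (hypothetical naming)."""
    return "ucs2" if maxunicode <= 0xFFFF else "ucs4"


def egg_is_compatible(egg_tag, interpreter_maxunicode):
    """Return True if an egg built for `egg_tag` matches the interpreter."""
    return egg_tag == unicode_width_tag(interpreter_maxunicode)


# Tag of the interpreter running this snippet.
print(unicode_width_tag(sys.maxunicode))
```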

There are other issues as well, e.g. eGenix produces around
50 release files for every package release, amounting to
around 150 MB in some cases. It's currently just not feasible to
use PyPI for that.

>  - packages registered on PyPI without releases or without
>    source code release are subject to be removed after N days
>    after the day of registration

Same as above.

> Why?
> 
> Any package registered on PyPI is possibly crucial to any kind of
> development and deployment.
> 
> Packages hosted on external servers (referenced through a download_url)
> are subject to come and go - packages once released should be available
> at any time from a well-known location (PyPI). Dependencies on the
> availability of external downloads servers other than PyPI are hardly
> acceptable for real-world development and deployments.

I think it's for the package users to decide whether they
trust a package author to maintain his or her package.
That's not something PyPI can change.

> As an example: the Plone CMS buildouts depend on python-openid.
> This package is registered with PyPI
> 
> http://pypi.python.org/pypi/python-openid
> 
> but references to
> 
> http://openidenabled.com/files/python-openid/packages/python-openid-2.2.4.tar.gz
> 
> For whatever reason the download URL is no longer working. In fact:
> openidenabled.com now points to http://www.janrain.com.

That's a problem with that particular package, so you should
contact the package author.

Just because one URL goes away doesn't mean that *all* PyPI
package authors who host their software elsewhere are in
poor standing.

> Other reasons for disappearing package in the past:
> 
>  - network or server outages of external servers
>  - users changed their organization and the organization removed
>    content of their former employees

I'd say you open a support request for PyPI and then
let a sys admin add a note to the package or remove the
broken download URL.

> PyPI is a valuable and crucial resource for Python development.
> It must be kept up-to-date and consistent.
> 
> I don't care about the arguments that were made in the past against
> stronger rules ("openness" etc.).

If that's so, why should we then care about your arguments ?

> There are a lot of Python programmers around that are not Python geeks
> as most of us are and they just become pissed of when packages come and
> go or are not in the place where one would expect them.

That's the nature of the Internet. Besides, would you really want
to use a package that's not being maintained anymore ? Even if you do
have a source or binary distribution for a package on PyPI, would
you really continue to use it if you don't know the author
and it hadn't had any release for 3 years ?

You can't just blindly rely on things that were uploaded to
PyPI and the proposed policy change won't make a difference in
that respect.

> PyPI is a community resource - but community does not mean anarchy where
> everyone should be able to upload its package crap without looking left
> and right and having the community and its needs in mind.

I think that's asking a bit too much of the package authors. PyPI
is just a resource to announce and catalog Python packages, nothing
more.

> PyPI must become a stable package index. Everything registered with PyPI
> must be available at any time (mirrors, distributing PyPI in the cloud...).

I agree that everything uploaded to PyPI should be available
anytime, but not that everything registered with PyPI also
has to be uploaded to PyPI.

Making PyPI more reliable will likely increase the number of
package authors who trust PyPI to host their packages.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 17 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                31 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From mal at egenix.com  Thu Jun 17 09:57:52 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 17 Jun 2010 09:57:52 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19D061.5020303@v.loewis.de>
References: <4C19A308.5040806@zopyx.com>
	<4C19C7A0.9080800@v.loewis.de>	<4C19CA43.9000509@zopyx.com>
	<4C19D061.5020303@v.loewis.de>
Message-ID: <4C19D580.7080909@egenix.com>

"Martin v. Löwis" wrote:
>> For such packages: send out an email to the package maintainer informing
>> him about the problem and instructing him to fix the problem within N
>> days.
>>
>> After N days: recheck the package state and unregister the package if
>> necessary.
>>
>> Or perhaps a less rude approach: introduce status field for each package
>> (ACTIVE/INACTIVE) and set the state to INACTIVE when the package does
>> not comply with this policy. Inactive packages won't be listed on PyPI
>> and won't be searchable on PyPI. Inactive status should be visible
>> to the author (in logged-in state) with some warning "Package is
>> inactive..please upload your sdist....).
> 
> Ok. If nobody opposes to this right now, it's fine with me as well.
> However, I won't be able to work on this for several months to come.
> 
> IMO, it's a waste of energy: if a package is useless, just don't use it,
> and be done. There are many packages on PyPI that are useless to me
> despite having a source release.

Agreed.

PyPI can't replace the due diligence that every package user has to
apply before choosing to invest time in using a package.

-- 
Marc-Andre Lemburg
eGenix.com


From lists at zopyx.com  Thu Jun 17 10:05:25 2010
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 17 Jun 2010 10:05:25 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19D4CA.1090304@egenix.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com>
Message-ID: <4C19D745.3050900@zopyx.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

M.-A. Lemburg wrote:
> Andreas Jung wrote:
>> Hi there,
>>
>> I propose a policy change for packages registered with PyPI:
>>
>>  - packages registered on PyPI have at least one release
> 
> I'm not sure what you mean with "release". Every package on
> PyPI is a release, since it comes with a version number.

This is a package without a release:

http://pypi.python.org/pypi/python-openid

> 
>>  - one release of registered package on PyPI _must_ contain
>>    a valid source code distribution (sdist)
> 
> -100
> 
> You'd outrule commercial packages that don't come with a
> source distribution. PyPI is for everyone, not only for
> open source packages.

Commercial packages are a special case - I agree. The majority
of all PyPI packages are non-commercial. In addition you could also
upload a binary release alongside your own download server.


> 
> Furthermore, not all package authors want to upload their
> packages to PyPI.

And this is _exactly_ the problem. If you are a package author
and want to make your packages available to the public through PyPI,
you should be obliged to publish the related distribution
files on PyPI: for the sake of availability and in order to be
independent of your own infrastructure. Otherwise I have the (arrogant)
opinion: go away - if you are a package author and want to use PyPI:
ensure that your software is available to everyone at any time.

PyPI is not a kindergarten - PyPI is an important resource for
professional Python development. CPAN has been better organized and more
reliable for more than ten years than PyPI ever was.

Andreas


- -- 
ZOPYX Limited           | zopyx group
Charlottenstr. 37/1     | The full-service network for Zope & Plone
D-72070 Tübingen        | Produce & Publish
www.zopyx.com           | www.produce-and-publish.com
- ------------------------------------------------------------------------
E-Publishing, Python, Zope & Plone development, Consulting


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwZ10UACgkQCJIWIbr9KYxXwACfSpGgjaEE1Yk9+UYk7nBqodJr
cfsAn2SlxwFAhXn/LIiOC4TnOEI0F31t
=qxLs
-----END PGP SIGNATURE-----

From jannis at leidel.info  Thu Jun 17 10:14:31 2010
From: jannis at leidel.info (Jannis Leidel)
Date: Thu, 17 Jun 2010 10:14:31 +0200
Subject: [Catalog-sig] PyPI down again...
In-Reply-To: <AANLkTilC0V2Bb5lbY5H4ym-AhBzfZJXgQArvvxC9nD0F@mail.gmail.com>
References: <4C121377.4000008@simplistix.co.uk>
	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>
	<4C127DD4.5010801@v.loewis.de>
	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>
	<4C12A2E4.2090305@v.loewis.de>
	<362E7782-303B-4ED1-803A-EA82762F6365@activestate.com>
	<AANLkTilC0V2Bb5lbY5H4ym-AhBzfZJXgQArvvxC9nD0F@mail.gmail.com>
Message-ID: <60277BDC-FB55-4901-A7BE-AD67ED6D35E3@leidel.info>


Am 17.06.2010 um 09:27 schrieb Mathieu Leduc-Hamel:

> Yeah for sure there's different implementation of Pypi in django or with other framework.
> 
> You can check this one too: http://pypi.python.org/pypi/chishop/0.2.0

FYI, djangopypi is a fork of chishop to separate the reusable and example server parts better. I already contributed a few patches lately and will keep working on it over the summer.

> But the question was not necessarily how difficult it was to do it but if it would acceptable by the community, but we are on the right list to discuss that.
> 
> have you try it, is it working properly ?

It worked in my manual tests but needs more testing with easy_install, et al. If anyone is interested, there is a buildout config included in the repository [1] which should get you up and running quickly.

Best,
Jannis

1: http://github.com/benliles/chishop


> On Wed, Jun 16, 2010 at 7:06 PM, Sridhar Ratnakumar <sridharr at activestate.com> wrote:
> 
> On 2010-06-11, at 1:56 PM, Martin v. Löwis wrote:
> 
> > If you are willing to invest *a lot* of time, then it seems that rewriting PyPI in Django would make a lot of people happy, because
> > they claim they can't contribute to the current code base because
> > they don't understand that. I don't want to do such a rewrite on
> > my own because I *do* understand the code base (despite not having written it in the first place, so I think that if you really want
> > to contribute, you can learn how it works); it also violates Joel
> > Spolsky's principle of never ever doing rewrites.
> 
> FYI: I just happened to stumble upon what claims to be a "re-implementation of PyPI" in Django:
> http://pypi.python.org/pypi/djangopypi/0.4
> 
> -srid
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
> 


From mal at egenix.com  Thu Jun 17 10:28:25 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 17 Jun 2010 10:28:25 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19D745.3050900@zopyx.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com>
	<4C19D745.3050900@zopyx.com>
Message-ID: <4C19DCA9.5010308@egenix.com>

Andreas Jung wrote:
> M.-A. Lemburg wrote:
>> Andreas Jung wrote:
>>> Hi there,
>>>
>>> I propose a policy change for packages registered with PyPI:
>>>
>>>  - packages registered on PyPI have at least one release
> 
>> I'm not sure what you mean with "release". Every package on
>> PyPI is a release, since it comes with a version number.
> 
> This is a package without a release:
> 
> http://pypi.python.org/pypi/python-openid

It has a name and a version number, so it's a release. It may
be an unavailable release, just like say, Windows 98, is not
available anymore - and that didn't have a source release
file to download either :-)

And I can see that you've added a comment to the package
that the download URL is not working - that's good, since
it will warn users to double-check.

>>>  - one release of registered package on PyPI _must_ contain
>>>    a valid source code distribution (sdist)
> 
>> -100
> 
>> You'd outrule commercial packages that don't come with a
>> source distribution. PyPI is for everyone, not only for
>> open source packages.
> 
> Commercial package are a special case - I agree. The majority
> of all PyPI are non-commercial. In addition you could also
> upload binary release in addition to your own download server.

See my other comments: we might want to do that in the future,
but at the moment, uploading 50 release files with around
150MB every time we do a release is not within range.

>> Furthermore, not all package authors want to upload their
>> packages to PyPI.
> 
> And this is _exactly_ the problem. If you are a package author
> and want to make your packages available to the public through PyPI,
> you should be obligated for publishing the related distribution
> files on PyPI: for the sake of availability and in order for being
> independent of your own infrastructure. Otherwise I have the (arrogant)
> opinion: go away - if you are a package author and want to use PyPI:
> ensure that your software is available to everyone at any time.

What about those package authors who host their package
elsewhere for various reasons and *do* make sure that their
infrastructure is available - even if PyPI is down ?

I have the feeling that you had a problem with that one
package you mentioned and the proposal was just a reaction
to the associated anger with that.

It's not fair to start policing all packages on PyPI just
because of that one incident you had.

> PyPI is not a kindergarten - PyPI is an important resource for
> professional Python development. CPAN is better organized and more
> reliable for more than ten years than PyPI ever was.

To be fair, CPAN has been around a lot longer than PyPI.

Regarding reliability of PyPI: as you've probably seen, I'm
taking that seriously and want to enhance the reliability of PyPI.

Regarding PyPI being used as resource for professional development:
the zc.buildout approach has taken that idea a bit far, IMHO.

PyPI wasn't designed to be used by automated download and
installation tools that install hundreds of packages as opposed
to the few packages that users request manually via easy_install.

It's good to see that PyPI can still cope with that approach,
and pushing the data to the cloud and/or mirror servers will
improve performance even more.

-- 
Marc-Andre Lemburg
eGenix.com


From lists at zopyx.com  Thu Jun 17 10:40:15 2010
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 17 Jun 2010 10:40:15 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19DCA9.5010308@egenix.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com>
	<4C19D745.3050900@zopyx.com> <4C19DCA9.5010308@egenix.com>
Message-ID: <4C19DF6F.9050106@zopyx.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

M.-A. Lemburg wrote:
> Andreas Jung wrote:
>> M.-A. Lemburg wrote:
>>> Andreas Jung wrote:
>>>> Hi there,
>>>>
>>>> I propose a policy change for packages registered with PyPI:
>>>>
>>>>  - packages registered on PyPI have at least one release
>>> I'm not sure what you mean with "release". Every package on
>>> PyPI is a release, since it comes with a version number.
>> This is a package without a release:
>>
>> http://pypi.python.org/pypi/python-openid
> 
> It has a name and a version number, so it's a release. It may
> be an unavailable release, just like say, Windows 98, is not
> available anymore - and that didn't have a source release
> file to download either :-)

I don't care if it has a name and a version number. I was not able
to work on my project - other co-workers also complained... this
is not an acceptable situation... as a Python geek I can likely deal with
that, others can't :)

> 
> And I can see that you've added a comment to the package
> that the download URL is not working - that's good, since
> it will warn users to double-check.
> 
>>>>  - one release of registered package on PyPI _must_ contain
>>>>    a valid source code distribution (sdist)
>>> -100
>>> You'd outrule commercial packages that don't come with a
>>> source distribution. PyPI is for everyone, not only for
>>> open source packages.
>> Commercial package are a special case - I agree. The majority
>> of all PyPI are non-commercial. In addition you could also
>> upload binary release in addition to your own download server.
> 
> See my other comments: we might want to do that in the future,
> but at the moment, uploading 50 release files with around
> 150MB every time we do a release is not within range.

Point taken - but as said: your case is likely different.
When we do releases in the Zope world we also have to deal with lots
of packages...so doable somehow :)



>>> Furthermore, not all package authors want to upload their
>>> packages to PyPI.
>> And this is _exactly_ the problem. If you are a package author
>> and want to make your packages available to the public through PyPI,
>> you should be obligated for publishing the related distribution
>> files on PyPI: for the sake of availability and in order for being
>> independent of your own infrastructure. Otherwise I have the (arrogant)
>> opinion: go away - if you are a package author and want to use PyPI:
>> ensure that your software is available to everyone at any time.
> 
> What about those package authors who host their package
> elsewhere for various reasons and *do* make sure that their
> infrastructure is available - even if PyPI is down ?
> 
> I have the feeling that you had a problem with that one
> package you mentioned and the proposal was just a reaction
> to the associated anger with that.
> 
> It's not fair to start policing all packages on PyPI just
> because of that one incident you had.

We have had such issues over and over again in recent years.
A typical Zope/Plone installation requires over a hundred different
packages, and we have seen such failures with external servers
many times. The workaround was creating PyPI mirrors, project-related
mirrors or download caches... just workarounds, not really a reliable
and working infrastructure.

Andreas


- -- 
ZOPYX Limited           | zopyx group
Charlottenstr. 37/1     | The full-service network for Zope & Plone
D-72070 Tübingen        | Produce & Publish
www.zopyx.com           | www.produce-and-publish.com
- ------------------------------------------------------------------------
E-Publishing, Python, Zope & Plone development, Consulting


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwZ328ACgkQCJIWIbr9KYxyrACdESkhtKnlZmyBFc6SMnuY+1an
E70AoKrzyzcrCsLMrftXKAfz9UPtbcD5
=QFQd
-----END PGP SIGNATURE-----

From mal at egenix.com  Thu Jun 17 10:59:53 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 17 Jun 2010 10:59:53 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19DF6F.9050106@zopyx.com>
References: <4C19A308.5040806@zopyx.com>
	<4C19D4CA.1090304@egenix.com>	<4C19D745.3050900@zopyx.com>
	<4C19DCA9.5010308@egenix.com> <4C19DF6F.9050106@zopyx.com>
Message-ID: <4C19E409.8060603@egenix.com>

Andreas Jung wrote:
> M.-A. Lemburg wrote:
>>>> Furthermore, not all package authors want to upload their
>>>> packages to PyPI.
>>> And this is _exactly_ the problem. If you are a package author
>>> and want to make your packages available to the public through PyPI,
>>> you should be obligated for publishing the related distribution
>>> files on PyPI: for the sake of availability and in order for being
>>> independent of your own infrastructure. Otherwise I have the (arrogant)
>>> opinion: go away - if you are a package author and want to use PyPI:
>>> ensure that your software is available to everyone at any time.
> 
>> What about those package authors who host their package
>> elsewhere for various reasons and *do* make sure that their
>> infrastructure is available - even if PyPI is down ?
> 
>> I have the feeling that you had a problem with that one
>> package you mentioned and the proposal was just a reaction
>> to the associated anger with that.
> 
>> It's not fair to start policing all packages on PyPI just
>> because of that one incident you had.
> 
> We had such issues over and over again over the last years.
> A typical Zope/Plone installation requires over hundred different
> packages and we have seen such failures with external servers
> various times. The workaround was creating PyPI mirrors, project related
> mirrors or download caches....just workarounds but not really a reliable
> and working infrastructure..

I guess it's better to tell the package authors about your
use of their packages and offer them help in hosting their
packages on more reliable infrastructures.

If that doesn't solve your problem, it's likely better
either to set up your own index to override the PyPI one
(should be easy to do in zc.buildout and AFAIK at least
Plone is already doing that), or to stop using
the package and look for alternatives.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 17 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                31 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From lists at zopyx.com  Thu Jun 17 11:05:19 2010
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 17 Jun 2010 11:05:19 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19E409.8060603@egenix.com>
References: <4C19A308.5040806@zopyx.com>
	<4C19D4CA.1090304@egenix.com>	<4C19D745.3050900@zopyx.com>
	<4C19DCA9.5010308@egenix.com> <4C19DF6F.9050106@zopyx.com>
	<4C19E409.8060603@egenix.com>
Message-ID: <4C19E54F.6030203@zopyx.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

M.-A. Lemburg wrote:

> 
> I guess it's better to tell the package authors about your
> use of their packages and offer them help in hosting their
> packages on more reliable infrastructures.



> 
> If that doesn't solve your problem, it's likely better
> to either setup your own index to override the PyPI one
> (should be easy to do in zc.buildout and AFAIK at least
> Plone is already doing that), or you stop using
> the package and look for alternatives.

Sorry - I disagree completely. As a developer I am into developing
software and not into building private infrastructure to get around the
deficiencies of PyPI and the ignorance of some package maintainers
who do not care about the needs of the developers using their packages.

Andreas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwZ5U8ACgkQCJIWIbr9KYxrVACdH6G8zDI/6RMjAywRSvUhri8M
F08Anins1oOc3abEMSc4FZggol0cQjXl
=5fgV
-----END PGP SIGNATURE-----

From mal at egenix.com  Thu Jun 17 11:51:13 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 17 Jun 2010 11:51:13 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19E54F.6030203@zopyx.com>
References: <4C19A308.5040806@zopyx.com>	<4C19D4CA.1090304@egenix.com>	<4C19D745.3050900@zopyx.com>	<4C19DCA9.5010308@egenix.com>
	<4C19DF6F.9050106@zopyx.com>	<4C19E409.8060603@egenix.com>
	<4C19E54F.6030203@zopyx.com>
Message-ID: <4C19F011.6010501@egenix.com>

Andreas Jung wrote:
> M.-A. Lemburg wrote:
> 
> 
>> I guess it's better to tell the package authors about your
>> use of their packages and offer them help in hosting their
>> packages on more reliable infrastructures.
> 
> 
> 
> 
>> If that doesn't solve your problem, it's likely better
>> to either setup your own index to override the PyPI one
>> (should be easy to do in zc.buildout and AFAIK at least
>> Plone is already doing that), or you stop using
>> the package and look for alternatives.
> 
> Sorry - I disagree completely. As developer I am into developing
> software and not into building private infrastructure to get around the
> deficiencies of PyPI 

Well, we're trying to change those ...

> and the ignorance of some package maintainers
> caring about the needs of the developers using their packages.

... can't help with this, though.

Package authors typically have a wide range of motivations to
write and share software for others to use. They don't
necessarily share your views or see a need to fulfill your
particular requirements.

If you do have a business requirement to rely on their packages,
I'd suggest you ask those package authors for a support
contract. That would likely help them adapt to your needs ;-)

Back to your proposal: In your particular case, I don't see
how the proposal would have helped you - under the proposal,
the package would have been removed from the PyPI index,
so either way, there would have been no working automatic
access to the package download links.

-- 
Marc-Andre Lemburg
eGenix.com


From kai.diefenbach at iqpp.de  Thu Jun 17 12:27:41 2010
From: kai.diefenbach at iqpp.de (Kai Diefenbach)
Date: Thu, 17 Jun 2010 12:27:41 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com>
	<4C19D745.3050900@zopyx.com> <4C19DCA9.5010308@egenix.com>
	<4C19DF6F.9050106@zopyx.com> <4C19E409.8060603@egenix.com>
	<4C19E54F.6030203@zopyx.com> <4C19F011.6010501@egenix.com>
Message-ID: <hvctat$k9$1@dough.gmane.org>

Hi,

On 2010-06-17 11:51:13 +0200, M.-A. Lemburg said:

> Back to your proposal: In your particular case, I don't see
> how the proposal would have helped you - under the proposal,
> the package would have been removed from the PyPI index,
> so either way, there would have been no working automatic
> access to the package download links.

Why?

Crap without source code distribution will never be published so no one 
can ever build a dependency on that.

AJ: "packages once released should be available at any time from a 
well-known location (PyPI)"

Problem solved.

Kai








From mal at egenix.com  Thu Jun 17 12:47:06 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 17 Jun 2010 12:47:06 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <hvctat$k9$1@dough.gmane.org>
References: <4C19A308.5040806@zopyx.com>
	<4C19D4CA.1090304@egenix.com>	<4C19D745.3050900@zopyx.com>
	<4C19DCA9.5010308@egenix.com>	<4C19DF6F.9050106@zopyx.com>
	<4C19E409.8060603@egenix.com>	<4C19E54F.6030203@zopyx.com>
	<4C19F011.6010501@egenix.com> <hvctat$k9$1@dough.gmane.org>
Message-ID: <4C19FD2A.3050801@egenix.com>

Kai Diefenbach wrote:
> Hi,
> 
> On 2010-06-17 11:51:13 +0200, M.-A. Lemburg said:
> 
>> Back to your proposal: In your particular case, I don't see
>> how the proposal would have helped you - under the proposal,
>> the package would have been removed from the PyPI index,
>> so either way, there would have been no working automatic
>> access to the package download links.
> 
> Why?
> 
> Crap without source code distribution will never be published so no one
> can ever build a dependency on that.
>
> AJ: "packages once released should be available at any time from a
> well-known location (PyPI)"
> 
> Problem solved.

Please have a look at the package in question. The only problem
with it is that the download URL registered on PyPI no longer works.
It redirects to the download page where you can find the source
distribution.

Not much of a problem for a user searching for the archives.

Only a problem for setuptools and zc.buildout that don't ship
with enough AI to figure it out :-)

To get back to your argument:

Crap *with* source code distribution would still get published,
so people would still build dependencies on it.

How does this solve the problem ?

Note that Andreas wasn't talking about crappy software, he
was only complaining about the fact that automatic downloads
via setuptools sometimes don't work for some packages on PyPI.

-- 
Marc-Andre Lemburg
eGenix.com


From do3ccqrv at googlemail.com  Thu Jun 17 13:20:55 2010
From: do3ccqrv at googlemail.com (Patrick Gerken)
Date: Thu, 17 Jun 2010 13:20:55 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <4C19FD2A.3050801@egenix.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com> 
	<4C19D745.3050900@zopyx.com> <4C19DCA9.5010308@egenix.com> 
	<4C19DF6F.9050106@zopyx.com> <4C19E409.8060603@egenix.com> 
	<4C19E54F.6030203@zopyx.com> <4C19F011.6010501@egenix.com> 
	<hvctat$k9$1@dough.gmane.org> <4C19FD2A.3050801@egenix.com>
Message-ID: <AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>

On Thu, Jun 17, 2010 at 12:47, M.-A. Lemburg <mal at egenix.com> wrote:

> Kai Diefenbach wrote:
> > Hi,
> >
> > On 2010-06-17 11:51:13 +0200, M.-A. Lemburg said:
> >
> >> Back to your proposal: In your particular case, I don't see
> >> how the proposal would have helped you - under the proposal,
> >> the package would have been removed from the PyPI index,
> >> so either way, there would have been no working automatic
> >> access to the package download links.
> >
> > Why?
> >
> > Crap without source code distribution will never be published so no one
> > can ever build a dependency on that.
> >
> > AJ: "packages once released should be available at any time from a
> > well-known location (PyPI)"
> >
> > Problem solved.
>
> Please have a look at the package in question. The only problem
> with it is that the download URL registered on PyPI no longer works.
> It redirects to the download page where you can find the source
> distribution.
>

And that's exactly what Andreas' argument is targeting.


> Not much or a problem for a user searching for the archives.
>
> Only a problem for setuptools and zc.buildout that don't ship
> with enough AI to figure out :-)
>

> To get back to your argument:
>
> Crap *with* source code distribution would still get published,
> so people would still build dependencies on it.
>
> How does this solve the problem ?
>

Not putting the source release on PyPI is just one indicator of crappy
software. I agree that this is not a crap indicator for commercial
software.

A large number of users rely on tools that download software from PyPI
in an automated fashion, and it is a reasonable request that source,
once published, remains available forever.

If I understand correctly, you are against this proposal, which would
have protected users of setuptools/distribute/zc.buildout from problems
due to python-openid, because it would disallow the publication of
information about commercial packages on PyPI?

I see a point in that, but what is more important: having a catalog to
browse, or having a reliable repository of software to download?

As a Plone user who uses zc.buildout, I very much prefer reliable
downloads. It's not fun to search for the reason a supposedly repeatable
buildout suddenly fails because a company decided to rename itself.

How about only listing packages with provided source code on the simple
interface? AFAIK buildout always uses that, so a package like
python-openid would be visible in the end-user view but not installable
via buildout. That way nobody would ever have created a dependency on it
in the first place.

Best regards,

        Patrick

From mal at egenix.com  Thu Jun 17 13:40:02 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 17 Jun 2010 13:40:02 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com>
	<4C19D745.3050900@zopyx.com> <4C19DCA9.5010308@egenix.com>
	<4C19DF6F.9050106@zopyx.com> <4C19E409.8060603@egenix.com>
	<4C19E54F.6030203@zopyx.com> <4C19F011.6010501@egenix.com>
	<hvctat$k9$1@dough.gmane.org> <4C19FD2A.3050801@egenix.com>
	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
Message-ID: <4C1A0992.7070507@egenix.com>

Patrick Gerken wrote:
> On Thu, Jun 17, 2010 at 12:47, M.-A. Lemburg <mal at egenix.com> wrote:
> 
>> Kai Diefenbach wrote:
>>> Hi,
>>>
>>> On 2010-06-17 11:51:13 +0200, M.-A. Lemburg said:
>>>
>>>> Back to your proposal: In your particular case, I don't see
>>>> how the proposal would have helped you - under the proposal,
>>>> the package would have been removed from the PyPI index,
>>>> so either way, there would have been no working automatic
>>>> access to the package download links.
>>>
>>> Why?
>>>
>>> Crap without source code distribution will never be published so no one
>>> can ever build a dependency on that.
>>>
>>> AJ: "packages once released should be available at any time from a
>>> well-known location (PyPI)"
>>>
>>> Problem solved.
>>
>> Please have a look at the package in question. The only problem
>> with it is that the download URL registered on PyPI no longer works.
>> It redirects to the download page where you can find the source
>> distribution.
>>
> 
> And thats exactly what Andreas' argument is targeting.
> 
> 
>> Not much or a problem for a user searching for the archives.
>>
>> Only a problem for setuptools and zc.buildout that don't ship
>> with enough AI to figure out :-)
>>
> 
>> To get back to your argument:
>>
>> Crap *with* source code distribution would still get published,
>> so people would still build dependencies on it.
>>
>> How does this solve the problem ?
>>
> 
> Not putting the source release on pypi is just one indicator of crappy
> software.
> I agree that this is not a crap indicator for commercial software.
> 
> There is a big number of users using tools that download tools in an
> automated
> fashion from pypi, and it is a reasonable request that source once being
> published
> to be available forever.
> 
> If I understand it correctly, you are against this proposal, that would have
> protected users of setuptools/distribute/zc.buildouts from problems due to
> python-openid, because it would disallow the publication of information
> about commercial packages on pypi?

What I'm saying is that it's better to contact the package
authors whose entries cause problems than to force some
policy on all PyPI package entries which carelessly puts
packages that are not hosted on PyPI into the same category
as crappy software.

> I see a point in that, but what is more important, having a catalog to
> browse or
> having a reliable repository of software to download?
> 
> As a plone user who uses zc.buildout I very much prefer reliable downloads.
> Its not fun
> to search for the reason a supposedly repeatable buildout suddenly fails
> because
> a company decided to rename itself.

It is quite possible to delete package listings on PyPI. Wouldn't
you rather be informed about this by way of an error report in
zc.buildout than by finding out a few years later that the package
name has changed ?

> How about only listing packages with provided source code on the simple
> interface?
> afaik buildout always uses that, so a package python-openid is visible in
> the
> end-user view, but not installable via buildout. That way nobody would ever
> have had
> created a dependency on it in the first place.

If such external links are a problem for zc.buildout, why don't
you add an option to zc.buildout that prevents using such
packages ?

This is well possible by checking the /simple index entry
for links to package download files:

http://pypi.python.org/simple/python-openid/

vs.

http://pypi.python.org/simple/zc.buildout/
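
A rough sketch of such a check follows. It is not code from zc.buildout
or setuptools; the function name, the list of archive suffixes and the
two sample pages are assumptions made up for illustration. It parses the
HTML of a /simple page and keeps only links that point at an archive
file on the index host itself, so a package whose page contains nothing
but external links comes back with an empty list.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed set of distribution file suffixes; adjust as needed.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2", ".zip", ".egg", ".exe")


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def hosted_download_links(page_html, index_host="pypi.python.org"):
    """Return links that look like archives hosted on the index itself."""
    collector = LinkCollector()
    collector.feed(page_html)
    hosted = []
    for href in collector.links:
        parsed = urlparse(href)
        # Relative links (empty netloc) are treated as hosted on the index.
        is_local = parsed.netloc in ("", index_host)
        is_archive = parsed.path.lower().endswith(ARCHIVE_SUFFIXES)
        if is_local and is_archive:
            hosted.append(href)
    return hosted


# Hypothetical sample pages, not copies of the real index entries.
HOSTED_PAGE = """<html><body>
<a href="../../packages/source/e/example/example-1.0.tar.gz#md5=0123">example-1.0.tar.gz</a>
<a href="http://bugs.example.org/123" rel="homepage">bug 123</a>
</body></html>"""

EXTERNAL_ONLY_PAGE = """<html><body>
<a href="http://downloads.example.org/releases/" rel="download">1.0 download_url</a>
</body></html>"""

if __name__ == "__main__":
    print(hosted_download_links(HOSTED_PAGE))
    print(hosted_download_links(EXTERNAL_ONLY_PAGE))
```

A tool could refuse to install a package when the returned list is
empty, which is the kind of opt-in behaviour suggested above.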

BTW: what are all those bug links doing on the zc.buildout index page ?
They look a lot like a good possibility for injecting trojans.

-- 
Marc-Andre Lemburg
eGenix.com


From lists at zopyx.com  Thu Jun 17 13:55:40 2010
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 17 Jun 2010 13:55:40 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C1A0992.7070507@egenix.com>
References: <4C19A308.5040806@zopyx.com>
	<4C19D4CA.1090304@egenix.com>	<4C19D745.3050900@zopyx.com>
	<4C19DCA9.5010308@egenix.com>	<4C19DF6F.9050106@zopyx.com>
	<4C19E409.8060603@egenix.com>	<4C19E54F.6030203@zopyx.com>
	<4C19F011.6010501@egenix.com>	<hvctat$k9$1@dough.gmane.org>
	<4C19FD2A.3050801@egenix.com>	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
	<4C1A0992.7070507@egenix.com>
Message-ID: <4C1A0D3C.4050402@zopyx.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

M.-A. Lemburg wrote:

> What I'm saying is that it's better to contact the package
> authors whose entries cause problems than to force some
> policy on all PyPI package entries which carelessly puts
> packages that are not hosted on PyPI into the same category
> as crappy software.

In theory yes, in real life no - I have approached several package
maintainers in the past for various reasons. Some agreed with the
complaints, others just didn't care. Some consider PyPI their own
private repository with their own rules and see no need to care about
the community, e.g. by providing proper metadata (I call this
anti-social and PyPI misuse).


Andreas



-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwaDTwACgkQCJIWIbr9KYzFdQCdEGXCwjb/2qsEfzhzRNUK1Dpy
Dn8AoNyVoO6F3nMcacmCxeWTOC8muYYO
=UkLD
-----END PGP SIGNATURE-----

From tseaver at palladion.com  Thu Jun 17 14:14:45 2010
From: tseaver at palladion.com (Tres Seaver)
Date: Thu, 17 Jun 2010 08:14:45 -0400
Subject: [Catalog-sig] PyPI template improvements
In-Reply-To: <4C194755.2060704@v.loewis.de>
References: <AAD06C81-9B08-4C51-87CD-4C681D91A09F@ikanobori.jp>
	<4C194755.2060704@v.loewis.de>
Message-ID: <hvd3jl$muv$1@dough.gmane.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin v. Löwis wrote:

>> - - What are the supported browser versions by PyPI, I reckon it's
>> IE6/7/8+, Fx 2+, Opera 9+ Safari 4+?
> 
> What do you mean by "supported"? Officially supported, so that you can 
> make a help desk call if it won't work? None.
> 
> Or do you mean that the browser should be able to use the site? All of 
> them, plus any other browser you can think of, including Lynx and wget.

In web app land, "supported browsers" usually means the ones the
designer targets:  e.g., including "IE >= 7" in the list means that the
designer doesn't have to include workarounds for stupid glitches in
earlier IEs (or even test the design against those versions).

For CSS, this means that the site's appearance will be sometimes wonky
when running with an older-than-supported browser version.  Features
which depend on Javascript may not work at all, or only in degraded mode.


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkwaEbUACgkQ+gerLs4ltQ5gSACeJwvouqmyCfKDZxDQzD27EBfk
CFkAnAlSDA63Gaw79ag4hZA4G7hwjXLU
=So/m
-----END PGP SIGNATURE-----


From tseaver at palladion.com  Thu Jun 17 14:22:54 2010
From: tseaver at palladion.com (Tres Seaver)
Date: Thu, 17 Jun 2010 08:22:54 -0400
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19D4CA.1090304@egenix.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com>
Message-ID: <hvd433$ons$1@dough.gmane.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

M.-A. Lemburg wrote:

> And lastly, uploading packages to PyPI (still) has a serious
> problem: setuptools doesn't know the distinction between
> UCS2 and UCS4, so uploading eggs for Unix platforms doesn't
> work out in practice. setuptools also doesn't know that
> e.g. a Mac OS X fat release may still contain the right binaries
> for a non-fat build of Python.

Uploading any 'bdist_egg' build is basically a losing proposition.
Windows may be the exception, except that at least a vocal segment of
Windows PyPI users prefer 'bdist_wininst' distributions, which can also
be consumed by setuptools / distribute.
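
For reference, the UCS2/UCS4 distinction mentioned in the quoted
paragraph can be detected at runtime through sys.maxunicode. The helper
below is only an illustrative sketch (its name and the tag strings are
made up here, and it is not something setuptools provides): narrow
builds report 0xFFFF, wide builds report 0x10FFFF. On Python 3.3 and
later the value is always 0x10FFFF, so the distinction only matters for
older interpreters.

```python
import sys


def unicode_build_tag(maxunicode=None):
    """Return 'ucs2' for a narrow Unicode build and 'ucs4' for a wide one.

    The optional argument exists only so the function can be exercised
    without access to a narrow build.
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    return "ucs4" if maxunicode > 0xFFFF else "ucs2"


if __name__ == "__main__":
    print(unicode_build_tag())
```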

Note however that Andreas' proposal was to require that 'sdists' be
uploaded.  I personally won't use binary-only packages, but it has
historically been true that PyPI was intended to support them, as well
as to support registration of packages hosted offsite.  Andreas'
proposal doesn't address either of those cases.


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkwaE54ACgkQ+gerLs4ltQ7uBQCbBdAlRDxaiyWZNN3esR5GG/An
ZfsAnR83RqzGIx6hO+Ni+eZs2e1U0xkr
=Z1kG
-----END PGP SIGNATURE-----


From lists at zopyx.com  Thu Jun 17 14:26:33 2010
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 17 Jun 2010 14:26:33 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <hvd433$ons$1@dough.gmane.org>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com>
	<hvd433$ons$1@dough.gmane.org>
Message-ID: <4C1A1479.80909@zopyx.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Tres Seaver wrote:
> 
> Note however that Andreas' proposal was to require that 'sdists' be
> uploaded.  I personally won't use binary-only packages, but it has
> historically been true that PyPI was intended to support them, as well
> as to support registration of packages hosted offsite.  Andreas'
> proposal doesn't address either of those cases.

A more precise requirement would be:

 - upload the sdist if your package is open-source
 - upload the official distribution package if your package
   is commercial

Basically...upload everything that you would also keep on your own
server as official distribution.

Andreas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwaFHkACgkQCJIWIbr9KYwvJgCfW+Ar1vTYyNlDwXfuS31Jvl4M
fAsAnR9exynFltTLE0hVwTy7QH8rxvYC
=ldIp
-----END PGP SIGNATURE-----

From benji at benjiyork.com  Thu Jun 17 14:29:49 2010
From: benji at benjiyork.com (Benji York)
Date: Thu, 17 Jun 2010 08:29:49 -0400
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <4C1A0992.7070507@egenix.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com> 
	<4C19D745.3050900@zopyx.com> <4C19DCA9.5010308@egenix.com> 
	<4C19DF6F.9050106@zopyx.com> <4C19E409.8060603@egenix.com> 
	<4C19E54F.6030203@zopyx.com> <4C19F011.6010501@egenix.com> 
	<hvctat$k9$1@dough.gmane.org> <4C19FD2A.3050801@egenix.com> 
	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com> 
	<4C1A0992.7070507@egenix.com>
Message-ID: <AANLkTikczowGSN5jgGA6l19opbzNpnyJppiVGDFVTxCV@mail.gmail.com>

On Thu, Jun 17, 2010 at 7:40 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> http://pypi.python.org/simple/zc.buildout/
>
> BTW: what are all those bug links doing on the zc.buildout index page ?

PyPI scrapes all the links from the long description; for many projects
that includes a change log with links to fixed bugs.
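
To illustrate the effect (this is a minimal sketch, not PyPI's actual
implementation; the regular expression and the sample description are
assumptions), any routine that collects every URL from a long
description will also pick up the bug tracker links in a change log:

```python
import re

# Deliberately simple pattern: a URL runs until whitespace, a quote or
# an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_links(long_description):
    """Return the distinct URLs found in the text, in order of appearance."""
    links = []
    for match in URL_PATTERN.finditer(long_description):
        # Drop punctuation that commonly trails a URL in prose.
        url = match.group(0).rstrip(".,;:)")
        if url not in links:
            links.append(url)
    return links


# Hypothetical long description with a change log section.
SAMPLE_DESCRIPTION = """\
example.package
===============

Home page: http://example.org/example.package

Change log
----------

1.0.1
  - Fixed http://bugs.example.org/example/+bug/101
  - Fixed http://bugs.example.org/example/+bug/102.
"""

if __name__ == "__main__":
    for link in extract_links(SAMPLE_DESCRIPTION):
        print(link)
```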
-- 
Benji York

From ronaldoussoren at mac.com  Thu Jun 17 14:38:01 2010
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 17 Jun 2010 14:38:01 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com>
	<4C19D745.3050900@zopyx.com> <4C19DCA9.5010308@egenix.com>
	<4C19DF6F.9050106@zopyx.com> <4C19E409.8060603@egenix.com>
	<4C19E54F.6030203@zopyx.com> <4C19F011.6010501@egenix.com>
	<hvctat$k9$1@dough.gmane.org> <4C19FD2A.3050801@egenix.com>
	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
Message-ID: <FF59456C-D278-454F-959A-4725B40004F5@mac.com>


On 17 Jun, 2010, at 13:20, Patrick Gerken wrote:
> 
> Please have a look at the package in question. The only problem
> with it is that the download URL registered on PyPI no longer works.
> It redirects to the download page where you can find the source
> distribution.
> 
> And thats exactly what Andreas' argument is targeting. 
>  

Note that even a requirement to upload a package to PyPI won't reliably
solve Andreas' problem: the package owner could remove a release or even
the entire package. In an ideal world there would be no reason to remove
a package, but we don't live in such a world, and there are valid
reasons for wanting to do so. One example is being sued by some
organization that claims you're using their IP without a license.

Ronald

From mal at egenix.com  Thu Jun 17 14:46:28 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 17 Jun 2010 14:46:28 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C1A1479.80909@zopyx.com>
References: <4C19A308.5040806@zopyx.com>
	<4C19D4CA.1090304@egenix.com>	<hvd433$ons$1@dough.gmane.org>
	<4C1A1479.80909@zopyx.com>
Message-ID: <4C1A1924.7060302@egenix.com>

Andreas Jung wrote:
> Tres Seaver wrote:
> 
>> Note however that Andreas' proposal was to require that 'sdists' be
>> uploaded.  I personally won't use binary-only packages, but it has
>> historically been true that PyPI was intended to support them, as well
>> as to support registration of packages hosted offsite.  Andreas'
>> proposal doesn't address either of those cases.
> 
> A more precise requirement would be:
> 
>  - upload the sdist if your package is open-source
>  - upload the official distribution package if you are package
>    is commercial
> 
> Basically...upload everything that you would also keep on your own
> server as official distribution.

We cannot force authors to do this. There may be other reasons
why they can't upload such things to PyPI, e.g. crypto, trademark
and copyright laws, or even corporate rules if the author is
maintaining the package as part of his or her job.

What we can do is make it more attractive to upload distribution
files to PyPI, and also make the whole "find the right file
to download and install" story easy enough that automatic tools
don't just give up.

For that to work, we'd need to rethink the infrastructure a bit
more, though:

If more package authors start shipping egg files for
the various Unix platforms as both UCS2 and UCS4 and for 3 or
4 different Python versions and keep those files around for
several releases, we'll run into problems with having
to mirror all those download files.

We've been doing this for several years now and it's probably an
extreme example, but just as reference: we have almost 6GB of
Python archives up on our servers and that's just for ~10
packages.
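To put rough numbers on the mirroring concern, here is a minimal sketch of how the file count multiplies. The platform names, the version list and the number of releases kept are illustrative assumptions, not the actual eGenix build matrix.

```python
from itertools import product

# Hypothetical build matrix; every value here is an illustrative assumption.
platforms = ["linux-i686", "linux-x86_64", "freebsd-amd64"]
unicode_builds = ["ucs2", "ucs4"]
python_versions = ["2.4", "2.5", "2.6", "2.7"]
releases_kept = 5


def egg_matrix(platforms, unicode_builds, python_versions):
    """Return one (platform, unicode build, Python version) tuple per egg."""
    return list(product(platforms, unicode_builds, python_versions))


eggs_per_release = len(egg_matrix(platforms, unicode_builds, python_versions))
total_files = eggs_per_release * releases_kept
print(eggs_per_release, "eggs per release,", total_files, "files per package")
```

Even this small matrix gives two dozen files per release, and every mirror would have to carry each of them for every release that is kept around.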

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 17 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                31 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From do3ccqrv at googlemail.com  Thu Jun 17 14:54:35 2010
From: do3ccqrv at googlemail.com (Patrick Gerken)
Date: Thu, 17 Jun 2010 14:54:35 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <AANLkTinDX035xROeg1FGOm0sSAMHS6empqRJ9a4Jy2dj@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com> 
	<4C19D745.3050900@zopyx.com> <4C19DCA9.5010308@egenix.com> 
	<4C19DF6F.9050106@zopyx.com> <4C19E409.8060603@egenix.com> 
	<4C19E54F.6030203@zopyx.com> <4C19F011.6010501@egenix.com> 
	<hvctat$k9$1@dough.gmane.org> <4C19FD2A.3050801@egenix.com> 
	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com> 
	<4C1A0992.7070507@egenix.com>
	<AANLkTinDX035xROeg1FGOm0sSAMHS6empqRJ9a4Jy2dj@mail.gmail.com>
Message-ID: <AANLkTimSwINe1vhnKY9vOdbUxT5IGvOQJ2gIP2KFhZU5@mail.gmail.com>

On Thu, Jun 17, 2010 at 13:40, M.-A. Lemburg <mal at egenix.com> wrote:

> Patrick Gerken wrote:
> > As a plone user who uses zc.buildout I very much prefer reliable
> > downloads. It's not fun to search for the reason a supposedly
> > repeatable buildout suddenly fails because a company decided to
> > rename itself.
>
> It is well possible to delete package listings on PyPI. Wouldn't
> you rather be informed about this by way of an error report in
> zc.buildout than by finding that the package name has changed
> a few years later ?

I would prefer to have my buildout working. I do not always need the
newest versions, and we have cases where customers are working with a
specific version of plone where some additional packages made
backward-incompatible changes that prohibit us from using them for
these clients. So yes, I prefer working on a potentially outdated
version. During development we check regularly for new versions. We
have tools for that.

> > How about only listing packages with provided source code on the
> > simple interface? afaik buildout always uses that, so a package like
> > python-openid would be visible in the end-user view, but not
> > installable via buildout. That way nobody would ever have created a
> > dependency on it in the first place.
>
> If such external links are a problem for zc.buildout, why don't
> you add an option to zc.buildout that prevents using such
> packages ?

Because I consider pypi the root cause of the problem, not the tools.
pip also allows repeatable package sets by defining specific version
requirements. Should it then be patched too?

> This is well possible by checking the /simple index entry
> for links to package download files:
>
> http://pypi.python.org/simple/python-openid/
>
> vs.
>
> http://pypi.python.org/simple/zc.buildout/
>
> BTW: what are all those bug links doing on the zc.buildout index page ?
> They look a lot like a good possibility for injecting trojans.

I don't know.

What about the suggestion to show all packages on pypi but not all on
the simple view? I can imagine that having your packages advertised on
pypi generates reasonable revenue, and I am absolutely not against that.
But I am against a pypi index that cannot promise to keep its advertised
packages available. The simple index view is meant for machines, and I'd
be perfectly happy if the constraints suggested by Andreas were only
applied to that simple index.

Best regards,

        Patrick

From steve at pearwood.info  Thu Jun 17 14:55:48 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 17 Jun 2010 22:55:48 +1000
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com> <4C19FD2A.3050801@egenix.com>
	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
Message-ID: <201006172255.49175.steve@pearwood.info>

On Thu, 17 Jun 2010 09:20:55 pm Patrick Gerken wrote:
> Not putting the source release on pypi is just one indicator of
> crappy software.

Yeah, like that infamous example of crappy software, Numpy.

http://pypi.python.org/pypi/numpy/1.4.1




-- 
Steven D'Aprano

From do3ccqrv at googlemail.com  Thu Jun 17 15:10:37 2010
From: do3ccqrv at googlemail.com (Patrick Gerken)
Date: Thu, 17 Jun 2010 15:10:37 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <201006172255.49175.steve@pearwood.info>
References: <4C19A308.5040806@zopyx.com> <4C19FD2A.3050801@egenix.com> 
	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com> 
	<201006172255.49175.steve@pearwood.info>
Message-ID: <AANLkTilo6mcjGO7aW0Z3diXy8J_zQM63-pDwywTPiEp0@mail.gmail.com>

On Thu, Jun 17, 2010 at 14:55, Steven D'Aprano <steve at pearwood.info> wrote:

> On Thu, 17 Jun 2010 09:20:55 pm Patrick Gerken wrote:
> > Not putting the source release on pypi is just one indicator of
> > crappy software.
>
> Yeah, like that infamous example of crappy software, Numpy.
>
> http://pypi.python.org/pypi/numpy/1.4.1
>

I am sorry if I offended you. I do not call every piece of software that
does not release sources on pypi crappy. I also don't call numpy crappy.

Now, please tell me what you would do if sourceforge changed its URLs and
returned a 404 on the old download page. Would you update all the release
information? If not, the next time I run a buildout whose configuration
requires an old version of numpy and the download link is broken, my
buildout breaks too. And there might be reasons why I stick to a specific
older version.
That's what I would like to avoid.

Best regards,

           Patrick

From mal at egenix.com  Thu Jun 17 15:16:15 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 17 Jun 2010 15:16:15 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <AANLkTikczowGSN5jgGA6l19opbzNpnyJppiVGDFVTxCV@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com>
	<4C19D745.3050900@zopyx.com> <4C19DCA9.5010308@egenix.com>
	<4C19DF6F.9050106@zopyx.com> <4C19E409.8060603@egenix.com>
	<4C19E54F.6030203@zopyx.com> <4C19F011.6010501@egenix.com>
	<hvctat$k9$1@dough.gmane.org> <4C19FD2A.3050801@egenix.com>
	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
	<4C1A0992.7070507@egenix.com>
	<AANLkTikczowGSN5jgGA6l19opbzNpnyJppiVGDFVTxCV@mail.gmail.com>
Message-ID: <4C1A201F.6080609@egenix.com>

Benji York wrote:
> On Thu, Jun 17, 2010 at 7:40 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>> http://pypi.python.org/simple/zc.buildout/
>>
>> BTW: what are all those bug links doing on the zc.buildout index page ?
> 
> PyPI scrapes all the links from the long description; for many projects
> that includes a change log with links to fixed bugs.

Isn't that dangerous ?

AFAIK, setuptools would start opening all those URLs and might
find download files which are not necessarily under full control of
the author, e.g. anyone could add a comment to a bug report or
wiki page with a link to an egg file on some rogue server.

-- 
Marc-Andre Lemburg
eGenix.com


From ben+python at benfinney.id.au  Thu Jun 17 16:00:11 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Fri, 18 Jun 2010 00:00:11 +1000
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com>
	<4C19D745.3050900@zopyx.com>
Message-ID: <874oh1rc7o.fsf@benfinney.id.au>

Andreas Jung <lists at zopyx.com> writes:

> M.-A. Lemburg wrote:
> > You'd outrule commercial packages that don't come with a source
> > distribution. PyPI is for everyone, not only for open source
> > packages.
>
> Commercial package are a special case - I agree. The majority of all
> PyPI are non-commercial.

That's irrelevant to whether an sdist is uploaded.

Rather, the majority of PyPI packages are free software; whether they
are commercial or not is a separate dimension.

Commercial is not the opposite of free; proprietary is the opposite of
free. Commercial and proprietary are not at all the same thing.

-- 
 \       "Facts are meaningless. You could use facts to prove anything |
  `\              that's even remotely true!" -- Homer, _The Simpsons_ |
_o__)                                                                  |
Ben Finney

From mark at geek.net  Thu Jun 17 16:15:06 2010
From: mark at geek.net (Mark Ramm)
Date: Thu, 17 Jun 2010 10:15:06 -0400
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <4C19C7A0.9080800@v.loewis.de>
References: <4C19A308.5040806@zopyx.com>
	<4C19C7A0.9080800@v.loewis.de>
Message-ID: <AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com>

This would also impact projects like turbogears (perhaps we're the
only one, I don't know) that point to our own pypi compatible index
with the download URL.   We do this because then we can fix things
like packages with no windows eggs, packages that are broken on PyPI
or whatever.   And to help control which versions of which packages
get installed by setuptools/distribute when you easy_install tg.

I'm fine with putting sdists up on pypi, but still want people to be
downloading files from our controlled index by default where possible.

--Mark Ramm

On Thu, Jun 17, 2010 at 2:58 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> I propose a policy change for packages registered with PyPI:
>>
>>  - packages registered on PyPI have at least one release
>>
>>  - one release of registered package on PyPI _must_ contain
>>    a valid source code distribution (sdist)
>>
>>  - packages registered on PyPI without releases or without
>>    source code release are subject to be removed after N days
>>    after the day of registration
>
> So how would you implement that policy change? Please propose a phased
> approach, that gives affected people plenty of options to intervene if
> they disagree with the policy.
>
> Regards,
> Martin

From tseaver at palladion.com  Thu Jun 17 16:59:37 2010
From: tseaver at palladion.com (Tres Seaver)
Date: Thu, 17 Jun 2010 10:59:37 -0400
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com>	<4C19C7A0.9080800@v.loewis.de>
	<AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com>
Message-ID: <hvdd8p$tmb$1@dough.gmane.org>

Mark Ramm wrote:
> This would also impact projects like turbogears (perhaps we're the
> only one, I don't know) that point to our own pypi compatible index
> with the download URL.

Your *index* is the download URL, or the tarball in the index?

> We do this because then we can fix things
> like packages with no windows eggs, packages that are broken on PyPI
> or whatever.   And to help control which versions of which packages
> get installed by setuptools/distribute when you easy_install tg.
> 
> I'm fine with putting sdists up on pypi, but still want people to be
> downloading files from our controlled index by default where possible.

Exactly.  Anybody who says "repeatable deployment" and "install from
PyPI" in the same breath is fooling themselves already.

- People rename projects on PyPI.

- People remove distributions from PyPI.

- People *replace* distributions on PyPI.

All of which make it impossible to reliably and repeatably deploy
arbitrary software configurations (directly) from PyPI.  Managing your
own project-specific index is the only real solution.

Gonna-shoot-the-next-programmer-who-tells-me-don't-make-me-think'ly



Tres.
--
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com


From steve at pearwood.info  Thu Jun 17 17:11:01 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 18 Jun 2010 01:11:01 +1000
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <hvcea7$cb5$1@dough.gmane.org>
References: <4C19A308.5040806@zopyx.com> <hvcea7$cb5$1@dough.gmane.org>
Message-ID: <201006180111.02363.steve@pearwood.info>

On Thu, 17 Jun 2010 04:11:19 pm Christian Zagrodnick wrote:
> On 2010-06-17 06:22:32 +0200, Andreas Jung <lists at zopyx.com> said:
> > Hi there,
> >
> > I propose a policy change for packages registered with PyPI:
> >
> >  - packages registered on PyPI have at least one release
> > 
> >  - one release of registered package on PyPI _must_ contain
> >    a valid source code distribution (sdist)

-1000

Please take your religious wars elsewhere. Python might be open source 
software, but there is no requirement that only open source software 
can be written in Python, and PyPI is for all Python developers, not 
just FOSS developers.



> >  - packages registered on PyPI without releases or without
> >    source code release are subject to be removed after N days
> >    after the day of registration
> >
> > Why?
> >
> > Any package registered on PyPI is possibly crucial to any kind of
> > development and deployment.

Just because it's crucial to you doesn't mean you own it and can dictate 
what the package owner does with it.

The important question here is, who controls the package? Is it the 
package owner, or PyPI? Your proposal is to give control over the 
package to PyPI rather than the owner and strip the developer of 
control in return for indexing the package on PyPI. Not only is that in 
my opinion rude and unethical, but I expect it will lead to a lot of 
authors abandoning PyPI. Instead of being the one obvious place to 
index Python packages, this proposal will fragment the package space. 
Not where the packages are hosted, but where they are indexed.


> > Packages hosted on external servers (referenced through a
> > download_url) are subject to come and go - packages once released
> > should be available at any time from a well-known location (PyPI).

And packages that are crucial to development should be bug-free, so 
perhaps we should ban packages that contain bugs too?


> > Dependencies on the availability of external downloads servers
> > other than PyPI are hardly acceptable for real-world development
> > and deployments.
>
> I second that. External download URLs are really a pain.

Then don't use them. Problem solved.


> I don't think that removing packages that way would really solve the
> problem. I think the core is:
>
> * Require the package to have a source dist *on* PyPI
> * Forbid removing any source package.

You would FORBID the package author from removing his or her own 
package? Whiskey-Tango-Foxtrot.

There are all sorts of reasons, some good, some bad, why an author might 
decide to remove his package from public distribution. What gives you 
the right to decide that he should be prohibited from doing so?



-- 
Steven D'Aprano

From l at lrowe.co.uk  Thu Jun 17 18:19:03 2010
From: l at lrowe.co.uk (Laurence Rowe)
Date: Thu, 17 Jun 2010 09:19:03 -0700 (PDT)
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C1A0992.7070507@egenix.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com>
	<4C19D745.3050900@zopyx.com> <4C19DCA9.5010308@egenix.com>
	<4C19DF6F.9050106@zopyx.com> <4C19E409.8060603@egenix.com>
	<4C19E54F.6030203@zopyx.com> <4C19F011.6010501@egenix.com>
	<hvctat$k9$1@dough.gmane.org> <4C19FD2A.3050801@egenix.com>
	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
	<4C1A0992.7070507@egenix.com>
Message-ID: <28916555.post@talk.nabble.com>




M.-A. Lemburg wrote:
> 
> If such external links are a problem for zc.buildout, why don't
> you add an option to zc.buildout that prevents using such
> packages ?
> 
> This is well possible by checking the /simple index entry
> for links to package download files:
> 
> http://pypi.python.org/simple/python-openid/
> 
> vs.
> 
> http://pypi.python.org/simple/zc.buildout/
> 
> BTW: what are all those bug links doing on the zc.buildout index page ?
> They look a lot like a good possibility for injecting trojans.
> 

That's an artefact of setuptools looking for downloadable packages from the
download_url or any url linked from the description. If all packages were
uploaded to pypi, the simple index would be much simpler.
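The check suggested in the quoted message, telling files hosted on the index apart from scraped external links, could be sketched roughly as follows. The sample page, the `classify_links` helper and the list of archive suffixes are assumptions made for illustration; this is not how setuptools or zc.buildout actually implement link scanning.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed set of distribution file suffixes, for illustration only.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".zip", ".egg")


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def classify_links(page_url, html):
    """Split links into files hosted on the index host and everything else."""
    collector = LinkCollector()
    collector.feed(html)
    index_host = urlparse(page_url).netloc
    hosted, external = [], []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)
        parts = urlparse(absolute)
        if parts.netloc == index_host and parts.path.endswith(ARCHIVE_SUFFIXES):
            hosted.append(absolute)
        else:
            external.append(absolute)
    return hosted, external


# Hypothetical page content, invented for this example.
SAMPLE_URL = "http://pypi.python.org/simple/example/"
SAMPLE_HTML = """
<a href="../../packages/source/e/example/example-1.0.tar.gz#md5=0123">example-1.0.tar.gz</a>
<a href="http://bugs.example.org/issue/42">bug 42</a>
<a href="http://downloads.example.org/example-0.9.tar.gz">0.9 download</a>
"""

hosted, external = classify_links(SAMPLE_URL, SAMPLE_HTML)
print(len(hosted), "hosted file(s),", len(external), "external link(s)")
```

A tool that wanted to avoid externally hosted packages could refuse to install whenever `hosted` comes back empty, and could skip everything in `external` rather than following it.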

Laurence


From l at lrowe.co.uk  Thu Jun 17 18:37:22 2010
From: l at lrowe.co.uk (Laurence Rowe)
Date: Thu, 17 Jun 2010 09:37:22 -0700 (PDT)
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19A308.5040806@zopyx.com>
References: <4C19A308.5040806@zopyx.com>
Message-ID: <28916768.post@talk.nabble.com>



Andreas Jung wrote:
> 
> Hi there,
> 
> I propose a policy change for packages registered with PyPI:
> 
>  - packages registered on PyPI have at least one release
> 
>  - one release of registered package on PyPI _must_ contain
>    a valid source code distribution (sdist)
> 
>  - packages registered on PyPI without releases or without
>    source code release are subject to be removed after N days
>    after the day of registration
> 
> Why?
> 
> Any package registered on PyPI is possibly crucial to any kind of
> development and deployment.
> 
> Packages hosted on external servers (referenced through a download_url)
> are subject to come and go - packages once released should be available
> at any time from a well-known location (PyPI). Dependencies on the
> availability of external downloads servers other than PyPI are hardly
> acceptable for real-world development and deployments.
> 
> As an example: the Plone CMS buildouts depend on python-openid.
> This package is registered with PyPI
> 
> http://pypi.python.org/pypi/python-openid
> 
> but references to
> 
> http://openidenabled.com/files/python-openid/packages/python-openid-2.2.4.tar.gz
> 
> For whatever reason the download URL is no longer working. In fact:
> openidenabled.com now points to http://www.janrain.com.
> 
> Other reasons for disappearing package in the past:
> 
>  - network or server outages of external servers
>  - users changed their organization and the organization removed
>    content of their former employees
> 
> PyPI is a valuable and crucial resource for Python development.
> It must be kept up-to-date and consistent.
> 
> I don't care about the arguments that were made in the past against
> stronger rules ("openness" etc.).
> 
> There are a lot of Python programmers around that are not Python geeks
> as most of us are and they just become pissed off when packages come and
> go or are not in the place where one would expect them.
> 
> PyPI is a community resource - but community does not mean anarchy where
> everyone should be able to upload its package crap without looking left
> and right and having the community and its needs in mind.
> 
> PyPI must become a stable package index. Everything registered with PyPI
> must be available at any time (mirrors, distributing PyPI in the
> cloud...).
> 

While I agree it would be great if we could enforce source packages being
uploaded to pypi (at least for open source packages), agreement on this is
looking unlikely.

What we buildout users really want is for the simple index to contain a copy
of the uploaded files (or at least the source packages). Instead of creating
links to other referenced urls in the simple index, setuptools / distribute
could be used to fetch the package and store a copy. A flag could be set on
indexed proprietary packages to exclude them from the simple index.

There would seem to be a great benefit to doing this centrally and mirroring
out the result rather than multiple companies maintaining their own
individual pypi mirrors.

Laurence


From lists at zopyx.com  Thu Jun 17 18:53:41 2010
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 17 Jun 2010 18:53:41 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <FF59456C-D278-454F-959A-4725B40004F5@mac.com>
References: <4C19A308.5040806@zopyx.com>
	<4C19D4CA.1090304@egenix.com>	<4C19D745.3050900@zopyx.com>
	<4C19DCA9.5010308@egenix.com>	<4C19DF6F.9050106@zopyx.com>
	<4C19E409.8060603@egenix.com>	<4C19E54F.6030203@zopyx.com>
	<4C19F011.6010501@egenix.com>	<hvctat$k9$1@dough.gmane.org>
	<4C19FD2A.3050801@egenix.com>	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
	<FF59456C-D278-454F-959A-4725B40004F5@mac.com>
Message-ID: <4C1A5315.6000501@zopyx.com>

Ronald Oussoren wrote:
> 
> On 17 Jun, 2010, at 13:20, Patrick Gerken wrote:
>>
>>
>>     Please have a look at the package in question. The only problem
>>     with it is that the download URL registered on PyPI no longer works.
>>     It redirects to the download page where you can find the source
>>     distribution.
>>
>>
>> And thats exactly what Andreas' argument is targeting.
>>  
> 
> Note that even a requirement to upload a package to PyPI won't reliably
> solve Andreas' problem, the package owner could remove a release or even
> the entire package.  

Released is released. There are only very few cases where one should be
allowed to remove packages (e.g. containing viruses, malware etc.).
Otherwise released stuff must not be touched.

-aj

From lists at zopyx.com  Thu Jun 17 18:57:18 2010
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 17 Jun 2010 18:57:18 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C1A1924.7060302@egenix.com>
References: <4C19A308.5040806@zopyx.com>
	<4C19D4CA.1090304@egenix.com>	<hvd433$ons$1@dough.gmane.org>
	<4C1A1479.80909@zopyx.com> <4C1A1924.7060302@egenix.com>
Message-ID: <4C1A53EE.6030806@zopyx.com>

M.-A. Lemburg wrote:
> Andreas Jung wrote:
>> Tres Seaver wrote:
>>
>>> Note however that Andreas' proposal was to require that 'sdists' be
>>> uploaded.  I personally won't use binary-only packages, but it has
>>> historically been true that PyPI was intended to support them, as well
>>> as to support registration of packages hosted offsite.  Andreas'
>>> proposal doesn't address either of those cases.
>> A more precise requirement would be:
>>
>>  - upload the sdist if your package is open-source
>>  - upload the official distribution package if your package
>>    is commercial
>>
>> Basically...upload everything that you would also keep on your own
>> server as official distribution.
> 
> We cannot force authors to do this. There may be other reasons
> why they can't upload such things to PyPI, e.g. crypto, trademark
> and copyright laws, or even corporate rules if the author is
> maintaining the package as part of his or her job.

You are once again talking about edge cases. In general the majority of
all externally hosted packages are not affected by such issues and
should be hosted on PyPI.

-aj

Everything that is currently available on external

> 
> If more package authors start shipping egg files for
> the various Unix platforms as both UCS2 and UCS4 and for 3 or
> 4 different Python versions and keep those files around for
> several releases, we'll run into problems with having
> to mirror all those download files.

There is in general zero need for uploading eggs for various
Python versions if the module is Python only. I have seen packages
with uploads for Python 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 3.0, 3.1 for
Python-only packages. This is really nonsense...a single sdist
is usually good enough...To put it bluntly: a bunch of Python
developers have no idea about package hygiene and use PyPI as a package toilet.

-aj


-- 
ZOPYX Limited           | zopyx group
Charlottenstr. 37/1     | The full-service network for Zope & Plone
D-72070 Tübingen        | Produce & Publish
www.zopyx.com           | www.produce-and-publish.com
------------------------------------------------------------------------
E-Publishing, Python, Zope & Plone development, Consulting



From lists at zopyx.com  Thu Jun 17 18:58:31 2010
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 17 Jun 2010 18:58:31 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <201006172255.49175.steve@pearwood.info>
References: <4C19A308.5040806@zopyx.com>
	<4C19FD2A.3050801@egenix.com>	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
	<201006172255.49175.steve@pearwood.info>
Message-ID: <4C1A5437.4090804@zopyx.com>

Steven D'Aprano wrote:
> On Thu, 17 Jun 2010 09:20:55 pm Patrick Gerken wrote:
>> Not putting the source release on pypi is just one indicator of
>> crappy software.
> 
> Yeah, like that infamous example of crappy software, Numpy.
> 
> http://pypi.python.org/pypi/numpy/1.4.1
> 

What's wrong with this package? It seems properly packaged, has proper
metadata....?

-aj

From mal at egenix.com  Thu Jun 17 19:21:45 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 17 Jun 2010 19:21:45 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C1A53EE.6030806@zopyx.com>
References: <4C19A308.5040806@zopyx.com>	<4C19D4CA.1090304@egenix.com>	<hvd433$ons$1@dough.gmane.org>	<4C1A1479.80909@zopyx.com>
	<4C1A1924.7060302@egenix.com> <4C1A53EE.6030806@zopyx.com>
Message-ID: <4C1A59A9.7030204@egenix.com>

Andreas Jung wrote:
> M.-A. Lemburg wrote:
>> Andreas Jung wrote:
>>> Tres Seaver wrote:
>>>
>>>> Note however that Andreas' proposal was to require that 'sdists' be
>>>> uploaded.  I personally won't use binary-only packages, but it has
>>>> historically been true that PyPI was intended to support them, as well
>>>> as to support registration of packages hosted offsite.  Andreas'
>>>> proposal doesn't address either of those cases.
>>> A more precise requirement would be:
>>>
>>>  - upload the sdist if your package is open-source
>>>  - upload the official distribution package if your package
>>>    is commercial
>>>
>>> Basically...upload everything that you would also keep on your own
>>> server as official distribution.
> 
>> We cannot force authors to do this. There may be other reasons
>> why they can't upload such things to PyPI, e.g. crypto, trademark
>> and copyright laws, or even corporate rules if the author is
>> maintaining the package as part of his or her job.
> 
> You are once again talking about edge cases. In general the majority of
> all externally hosted packages are not affected by such issues and
> should be hosted on PyPI.

Well, there's certainly some reason why the authors chose
not to host on PyPI. I can only list a few.

>> If more package authors start shipping egg files for
>> the various Unix platforms as both UCS2 and UCS4 and for 3 or
>> 4 different Python versions and keep those files around for
>> several releases, we'll run into problems with having
>> to mirror all those download files.
> 
> There is in general zero need for uploading eggs for various
> Python versions if the module is Python only. I have seen packages
> with upload for Python 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 3.0, 3.1 for
> Python-only packages. This is really nonsense...a single sdist
> is usually good enough...To get to the point: a bunch of Python
> developers have no idea about package hygiene and use PyPI as a package toilet.

If you ship Python-only packages with precompiled .pyc/.pyo
files, you do need to upload one version per Python version.
The marshal format and pyc magic often change between releases.
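
For illustration, a minimal sketch of how such a magic-number mismatch
shows up. It assumes a modern Python 3 interpreter, where the value is
exposed as importlib.util.MAGIC_NUMBER (in the Python 2 releases
discussed here the equivalent call was imp.get_magic()). The helper
names pyc_matches_interpreter and demo are hypothetical.

```python
import importlib.util
import os
import py_compile
import tempfile


def pyc_matches_interpreter(pyc_path):
    """Return True if the .pyc file starts with this interpreter's magic number."""
    magic = importlib.util.MAGIC_NUMBER
    with open(pyc_path, "rb") as f:
        return f.read(len(magic)) == magic


def demo():
    """Compile a throwaway module, then fake a .pyc from 'another' Python version."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "mod.py")
        with open(src, "w") as f:
            f.write("x = 1\n")
        good = py_compile.compile(src, cfile=os.path.join(tmp, "mod.pyc"))

        # Hypothetical foreign file: same layout, different leading magic bytes.
        bad = os.path.join(tmp, "other.pyc")
        with open(bad, "wb") as f:
            f.write(b"\x00\x00\r\n" + b"\x00" * 12)

        return pyc_matches_interpreter(good), pyc_matches_interpreter(bad)


if __name__ == "__main__":
    print(demo())
```

A file compiled by the running interpreter passes the check, while one
carrying a different magic number does not, which is why precompiled
files have to be shipped once per Python version.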

Some developers probably don't know that if they switch off
the pyc compilation step, they'd get a single .egg file for
all Python versions they support. In that case, we'd need
to educate them, not call them names.

If you want more people to upload and host their packages
on PyPI, you have to:

 * make PyPI itself more robust and stable (we're working on that)

 * improve the tools to make both uploads and downloads
   easier (perhaps you could help with this)

 * convince people that their code is in good hands on PyPI
   (we'd need to get the PyPI terms straightened to help with
   this part)

Suggesting that they can never remove a release from PyPI
or are not allowed to rename a package is not going to
attract more developers to PyPI.

Calling them names, suggesting that their software is crap
or that they use PyPI as a dump, isn't going to attract
anyone either.

Anyway, I think I've said everything I wanted to say about
this.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 17 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                31 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From lists at zopyx.com  Thu Jun 17 19:40:29 2010
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 17 Jun 2010 19:40:29 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C1A59A9.7030204@egenix.com>
References: <4C19A308.5040806@zopyx.com>	<4C19D4CA.1090304@egenix.com>	<hvd433$ons$1@dough.gmane.org>	<4C1A1479.80909@zopyx.com>
	<4C1A1924.7060302@egenix.com> <4C1A53EE.6030806@zopyx.com>
	<4C1A59A9.7030204@egenix.com>
Message-ID: <4C1A5E0D.7060102@zopyx.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

M.-A. Lemburg wrote:

> If you ship Python-only packages with precompiled .pyc/.pyo
> files, you do need to upload one version per Python version.
> The marshal format and pyc magic often change between releases.

Once again: I am talking about the majority of packages that are neither
commercial nor shipping without the Python source code.


> 
>  * make PyPI itself more robust and stable (we're working on that)

PyPI is pretty robust and this has nothing to do with packages hosted
externally.

> 
>  * improve the tools to make both uploads and downloads
>    easier (perhaps you could help with this)

What can be easier than

python setup.py register upload

?

Uploading a package to your own server is likely more complicated than
an upload to PyPI.

> 
> Suggesting that they can never remove a release from PyPI
> or are not allowed to rename a package is not going to
> attract more developers to PyPI.

I would not care about such developers. Someone who renames or removes a
release and (intentionally) breaks the setup of other people acts
irresponsibly.

The basic question is: do we want PyPI to be a reliable and valuable
community resource or a partly unflushed package toilet?

Andreas

- -- 
ZOPYX Limited           | zopyx group
Charlottenstr. 37/1     | The full-service network for Zope & Plone
D-72070 Tübingen        | Produce & Publish
www.zopyx.com           | www.produce-and-publish.com
- ------------------------------------------------------------------------
E-Publishing, Python, Zope & Plone development, Consulting


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwaXgsACgkQCJIWIbr9KYybiwCgvi+IexiOksr3vLgjd6CJFDym
/ooAoIvYGrXybXMVwaB/7aw7s5Wc15D4
=85d7
-----END PGP SIGNATURE-----

From mark at geek.net  Thu Jun 17 19:44:52 2010
From: mark at geek.net (Mark Ramm)
Date: Thu, 17 Jun 2010 13:44:52 -0400
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <hvdd8p$tmb$1@dough.gmane.org>
References: <4C19A308.5040806@zopyx.com> <4C19C7A0.9080800@v.loewis.de>
	<AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com>
	<hvdd8p$tmb$1@dough.gmane.org>
Message-ID: <AANLkTike8MpGVOi2PjDJAhM2zFBrDiOI_W7V6yqI0Hbc@mail.gmail.com>

> Your *index* is the download URL, or the tarball in the index?

We don't have a tarball on pypi but the Download URL points to our index:

http://pypi.python.org/pypi/TurboGears2/2.0.3

Which contains just:

Download URL: http://www.turbogears.org/2.0/downloads/2.0.3/

and easy_install TG gets tg and all its dependencies from our specific index.

I don't care if it works in just exactly this way, but maintaining the
ability to create a controlled index is critical to making the
turbogears install process repeatable and reliable.

Also note, we have a new index url for each release -- so you'll
always be able to do a tg install for a specific version with known
working results.

--Mark Ramm

From lists at zopyx.com  Thu Jun 17 19:50:54 2010
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 17 Jun 2010 19:50:54 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <AANLkTike8MpGVOi2PjDJAhM2zFBrDiOI_W7V6yqI0Hbc@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com>
	<4C19C7A0.9080800@v.loewis.de>	<AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com>	<hvdd8p$tmb$1@dough.gmane.org>
	<AANLkTike8MpGVOi2PjDJAhM2zFBrDiOI_W7V6yqI0Hbc@mail.gmail.com>
Message-ID: <4C1A607E.2030904@zopyx.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mark Ramm wrote:
>> Your *index* is the download URL, or the tarball in the index?
> 
> We don't have a tarball on pypi but the Download URL points to our index:
> 
> http://pypi.python.org/pypi/TurboGears2/2.0.3
> 
> Which contains just:
> 
> Download URL: http://www.turbogears.org/2.0/downloads/2.0.3/
> 

How do you ensure the availability of the index and the packages at
any time?

- -aj
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwaYH4ACgkQCJIWIbr9KYx7wQCfaTqgtfXv7qgfLGX2TvjDB1sP
99sAoMxgyK6l7YDvbk/7Ur0IbiSsTXYJ
=YwUK
-----END PGP SIGNATURE-----

From mark at geek.net  Thu Jun 17 19:56:37 2010
From: mark at geek.net (Mark Ramm)
Date: Thu, 17 Jun 2010 13:56:37 -0400
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <4C1A607E.2030904@zopyx.com>
References: <4C19A308.5040806@zopyx.com> <4C19C7A0.9080800@v.loewis.de>
	<AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com>
	<hvdd8p$tmb$1@dough.gmane.org>
	<AANLkTike8MpGVOi2PjDJAhM2zFBrDiOI_W7V6yqI0Hbc@mail.gmail.com>
	<4C1A607E.2030904@zopyx.com>
Message-ID: <AANLkTilqematflkFtlkxGtmPTHbTfjLo11rnnDKZqKi4@mail.gmail.com>

> How do you ensure the availability of the index and the packages at
> any time?

By keeping our server up, and not depending on pypi.   If our server
goes down, packages will become unavailable, but if you want a mirror
for a particular revision of tg and all its dependencies you can just
grab a copy of

http://www.turbogears.org/2.0/downloads/2.0.3/ and host it on your own
servers at your company.

You can always use the -i
http://www.turbogears.org/2.0/downloads/2.0.3/ option to skip past
pypi completely and just use our (or if you made your own copy, your
very own) index.

We use http://pypi.python.org/pypi/basketweaver/ to make the index
once we've got a pile of eggs and tarballs in a local directory. Which
anybody with enough time can do.
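
The idea of turning a directory of eggs and tarballs into an index can
be sketched as follows. This is not basketweaver's code or API;
build_index, project_name and the filename heuristic are assumptions
made purely for illustration.

```python
import html
import os
import re
from collections import defaultdict

DIST_SUFFIXES = (".egg", ".tar.gz", ".zip")


def project_name(filename):
    """Guess the project name: everything before the first '-<digit>' (a heuristic)."""
    match = re.match(r"^(.+?)-\d", filename)
    return match.group(1) if match else None


def build_index(dist_dir, index_dir):
    """Write <index_dir>/<project>/index.html pages linking to files in dist_dir."""
    projects = defaultdict(list)
    for filename in sorted(os.listdir(dist_dir)):
        name = project_name(filename)
        if name and filename.endswith(DIST_SUFFIXES):
            projects[name].append(filename)

    for name, files in projects.items():
        page_dir = os.path.join(index_dir, name)
        os.makedirs(page_dir, exist_ok=True)
        # Relative links keep the tree usable when copied to another server.
        rel = os.path.relpath(dist_dir, page_dir).replace(os.sep, "/")
        links = "".join(
            '<a href="%s/%s">%s</a><br/>\n'
            % (html.escape(rel), html.escape(f), html.escape(f))
            for f in files
        )
        with open(os.path.join(page_dir, "index.html"), "w") as out:
            out.write("<html><body>\n%s</body></html>\n" % links)

    # Top-level page listing every project directory.
    os.makedirs(index_dir, exist_ok=True)
    with open(os.path.join(index_dir, "index.html"), "w") as out:
        out.write("<html><body>\n")
        for name in sorted(projects):
            out.write('<a href="%s/">%s</a><br/>\n'
                      % (html.escape(name), html.escape(name)))
        out.write("</body></html>\n")
    return sorted(projects)
```

The resulting directory could then be served statically and handed to
the installer through the -i option mentioned above.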

--Mark Ramm

From lists at zopyx.com  Thu Jun 17 20:03:47 2010
From: lists at zopyx.com (Andreas Jung)
Date: Thu, 17 Jun 2010 20:03:47 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <AANLkTilqematflkFtlkxGtmPTHbTfjLo11rnnDKZqKi4@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com>	<4C19C7A0.9080800@v.loewis.de>	<AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com>	<hvdd8p$tmb$1@dough.gmane.org>	<AANLkTike8MpGVOi2PjDJAhM2zFBrDiOI_W7V6yqI0Hbc@mail.gmail.com>	<4C1A607E.2030904@zopyx.com>
	<AANLkTilqematflkFtlkxGtmPTHbTfjLo11rnnDKZqKi4@mail.gmail.com>
Message-ID: <4C1A6383.80105@zopyx.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mark Ramm wrote:
>> How do you ensure the availability of the index and the packages at
>> any time?
> 
> By keeping our server up, and not depending on pypi.   If our server
> goes down, packages will become unavailable, but if you want a mirror
> for a particular revision of tg and all its dependencies you can just
> grab a copy of

Would you use PyPI as download server or as primary location if it were
more reliable or had a usable mirroring infrastructure?

The point: of course I can create my own internal mirror - but do we really
want or need that? My business is building software - not mirrors or
workarounds for a missing or unreliable package infrastructure.

Side note: just checked CPAN - CPAN has 228 official mirrors, PyPI has
no official mirrors (only four or five unofficial mirrors as part of
the PyPI mirroring project).

Andreas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwaY4MACgkQCJIWIbr9KYwmSACfbAbmuff4Jboy7UDcecwviTht
u9oAn35dq99B6Kqe4/YAZNuzyZ26MhU4
=cAj8
-----END PGP SIGNATURE-----

From mark at geek.net  Thu Jun 17 20:15:34 2010
From: mark at geek.net (Mark Ramm)
Date: Thu, 17 Jun 2010 14:15:34 -0400
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <4C1A6383.80105@zopyx.com>
References: <4C19A308.5040806@zopyx.com> <4C19C7A0.9080800@v.loewis.de>
	<AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com>
	<hvdd8p$tmb$1@dough.gmane.org>
	<AANLkTike8MpGVOi2PjDJAhM2zFBrDiOI_W7V6yqI0Hbc@mail.gmail.com>
	<4C1A607E.2030904@zopyx.com>
	<AANLkTilqematflkFtlkxGtmPTHbTfjLo11rnnDKZqKi4@mail.gmail.com>
	<4C1A6383.80105@zopyx.com>
Message-ID: <AANLkTikU2E6yPUGSlD06WwLm9W21Oxe-LwfflydI1npr@mail.gmail.com>

> Would you use PyPI as download server or as primary location if it were
> more reliable or had a usable mirroring infrastructure?

No.   Because it would still drop old packages, allow people to upload
new packages and otherwise make repeatable builds difficult.

I'm most frustrated by the dropping of old packages, but unless I lock
down things super tightly in setup.py new versions turn out to break
the tg install process often enough that we need more control than
pypi provides.
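
What "locking things down" in setup.py amounts to can be sketched with
a small check that flags every requirement not pinned to an exact
version. The project names and versions below are placeholders, and
unpinned() is a hypothetical helper, not part of setuptools.

```python
import re

# name == version, with nothing else (no ranges, no wildcards, no extras).
EXACT_PIN = re.compile(
    r"^\s*[A-Za-z0-9][A-Za-z0-9._-]*\s*==\s*[A-Za-z0-9][A-Za-z0-9.!+_-]*\s*$"
)


def unpinned(requirements):
    """Return the requirement strings that are not exact '==' pins."""
    return [req for req in requirements if not EXACT_PIN.match(req)]


# Placeholder requirement list, as it might appear in install_requires.
install_requires = [
    "ExampleFramework==1.0.2",
    "ExampleTemplates>=0.5",
    "ExampleORM",
]

if __name__ == "__main__":
    print(unpinned(install_requires))
```

Anything the check reports can resolve to a newer upload later on,
which is the kind of drift a frozen per-release index avoids.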

> The point: of course I can create my own internal mirror - but do we really
> want or need that? My business is building software - not mirrors or
> workarounds for a missing or unreliable package infrastructure.

Well, I need it.   I've put work into implementing it, and I want it to
continue to be supported, and for my use of this feature to continue
to work.  If you think it's bad and don't want that, then fine.   But
I'm more interested in making the tools I have now work now for the
users we have now.  And making pypi more available doesn't solve my
whole problem, and the proposal at the start of the thread makes it
worse for me.

> Side note: just checked CPAN - CPAN has 228 official mirrors, PyPI has
> no official mirrors (only four or five unofficial mirrors as part of
> the PyPI mirroring project).

Yea, more mirrors would be better.   No doubt.

--Mark Ramm

From ianb at colorstudy.com  Thu Jun 17 20:33:58 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 17 Jun 2010 13:33:58 -0500
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <AANLkTikU2E6yPUGSlD06WwLm9W21Oxe-LwfflydI1npr@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com> <4C19C7A0.9080800@v.loewis.de> 
	<AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com> 
	<hvdd8p$tmb$1@dough.gmane.org>
	<AANLkTike8MpGVOi2PjDJAhM2zFBrDiOI_W7V6yqI0Hbc@mail.gmail.com> 
	<4C1A607E.2030904@zopyx.com>
	<AANLkTilqematflkFtlkxGtmPTHbTfjLo11rnnDKZqKi4@mail.gmail.com> 
	<4C1A6383.80105@zopyx.com>
	<AANLkTikU2E6yPUGSlD06WwLm9W21Oxe-LwfflydI1npr@mail.gmail.com>
Message-ID: <AANLkTinwQRX6dOUq559nYpEX028zTj5-Bfg4I1giDvds@mail.gmail.com>

On Thu, Jun 17, 2010 at 1:15 PM, Mark Ramm <mark at geek.net> wrote:

> > Would you use PyPI as download server or as primary location if it were
> > more reliable or had a usable mirroring infrastructure?
>
> No.   Because it would still drop old packages, allow people to upload
> new packages and otherwise make repeatable builds difficult.
>

It does?  I thought PyPI kept everything around (but hidden) unless the
author went in and manually deleted old stuff.  You just need to go to a
deep link, e.g., http://pypi.python.org/pypi/SomePackage/0.1

-- 
Ian Bicking  |  http://blog.ianbicking.org

From jess.austin at gmail.com  Thu Jun 17 21:58:17 2010
From: jess.austin at gmail.com (Jess Austin)
Date: Thu, 17 Jun 2010 14:58:17 -0500
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
Message-ID: <AANLkTikpajF82Un0JQ4aaGSOZE0kdRsV41BbWiDikyIE@mail.gmail.com>

On Thu, Jun 17, 2010 at 12:40 PM, Andreas Jung <lists at zopyx.com> wrote:
> Once again: I am talking about the majority of packages that are neither
> commercial nor shipping without the Python source code.

This seems to say either that you don't care about the supposed
minority of packages that are "justified" in not releasing or in
removing sources, or that it will be easy to differentiate between
such packages and the remainder of the packages that are to suffer
your procrustean rules.  I don't accept, and you certainly haven't
made any arguments to support, either of those propositions.


>> Suggesting that they can never remove a release from PyPI
>> or are not allowed to rename a package is not going to
>> attract more developers to PyPI.
>
> I would not care about such developers. Someone who renames or removes a
> release and (intentionally) breaks the setup of other people acts
> irresponsibly.
>
> The basic question is: do we want PyPI to be a reliable and valuable
> community resource or a partly unflushed package toilet?

Stipulated, you are unabashed in your lack of care for the needs of
other PyPI users, for whom PyPI is already a valuable resource.  In
response, a question: is there anyone who supports this radical policy
change who is NOT a zc.buildout user?

Previously in this thread, there have been several plausible
suggestions for modifying (improving?) zc.buildout to cope with the
issues you've identified.  Have you relayed these suggestions to the
zc.buildout developers and administrators?  Do you know for a fact
that zc.buildout can't be fixed?  If so, perhaps it should be removed
from PyPI; I certainly wouldn't want to rely on it.

cheers,
Jess

From kevin at bud.ca  Thu Jun 17 22:18:52 2010
From: kevin at bud.ca (Kevin Teague)
Date: Thu, 17 Jun 2010 13:18:52 -0700
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <AANLkTikpajF82Un0JQ4aaGSOZE0kdRsV41BbWiDikyIE@mail.gmail.com>
References: <AANLkTikpajF82Un0JQ4aaGSOZE0kdRsV41BbWiDikyIE@mail.gmail.com>
Message-ID: <AANLkTikiXZPV8xnzQm2mwhDe8OdoREoqtpB6bCgX6lFh@mail.gmail.com>

> Previously in this thread, there have been several plausible
> suggestions for modifying (improving?) zc.buildout to cope with the
> issues you've identified.  Have you relayed these suggestions to the
> zc.buildout developers and administrators?  Do you know for a fact
> that zc.buildout can't be fixed?  If so, perhaps it should be removed
> from PyPI; I certainly wouldn't want to rely on it.
>
>
Didn't Setuptools/easy_install begin this policy of following the
download_url in PyPI's early days, when it wasn't even possible to upload
to PyPI (or at least during the transition when a majority of packages only
provided download_urls)? easy_install has been repeatedly critiqued for this
behaviour.

Can anyone say why pip and buildout follow this policy? Has there been any
thought to changing the install tools themselves?

I know that relying on PyPI doesn't give 100% repeatability, but it does
tend much more towards repeatability than following download_urls. I'd
much prefer that these tools require a flag to use this
behaviour, since many initially assume that these tools only download from
an index and find it quite unexpected that they'll follow links to other
servers.
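
The flag-gated behaviour suggested here could look roughly like the
following sketch. It is not how easy_install, pip or buildout are
implemented; split_links, allow_external, the example.org hosts and the
sample page are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags on one index page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def split_links(page_html, base_url, allow_external=False):
    """Return (followed, skipped); off-index links are followed only on request."""
    collector = LinkCollector(base_url)
    collector.feed(page_html)
    index_host = urlparse(base_url).netloc
    followed, skipped = [], []
    for url in collector.links:
        on_index = urlparse(url).netloc == index_host
        (followed if on_index or allow_external else skipped).append(url)
    return followed, skipped


# Hypothetical index page with one hosted file and one external download link.
SAMPLE_PAGE = """
<a href="../../packages/source/E/Example/Example-1.0.tar.gz">Example-1.0.tar.gz</a>
<a href="http://downloads.example.org/Example/" rel="download">download_url</a>
"""
```

With the default allow_external=False only links on the index host are
followed; links to other servers are reported back so the tool could
warn about them instead of silently crawling them.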

From martin at v.loewis.de  Thu Jun 17 22:44:24 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 17 Jun 2010 22:44:24 +0200
Subject: [Catalog-sig] PyPI template improvements
In-Reply-To: <hvd3jl$muv$1@dough.gmane.org>
References: <AAD06C81-9B08-4C51-87CD-4C681D91A09F@ikanobori.jp>	<4C194755.2060704@v.loewis.de>
	<hvd3jl$muv$1@dough.gmane.org>
Message-ID: <4C1A8928.8090709@v.loewis.de>

> In web app land, "supported browsers" usually means the ones the
> designer targets:  e.g., including "IE>= 7" in the list means that the
> designer doesn't have to include workarounds for stupid glitches in
> earlier IEs (or even test the design against those versions).
>
> For CSS, this means that the site's appearance will be sometimes wonky
> when running with an older-than-supported browser version.  Features
> which depend on Javascript may not work at all, or only in degraded mode.

I have a really hard time answering that question then: there was no web 
designer involved in creating PyPI (*). The browsers that the
*authors* of the service target are really the ones I mentioned: all of 
them.

There is one browser that gets special attention, and flaws relating to 
it get fixed faster than for any other browser: setuptools.

Regards,
Martin

(*) of course, it uses the layout of python.org, which did have a web 
designer; for this design, I don't know the answer.

From ianb at colorstudy.com  Thu Jun 17 22:54:29 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 17 Jun 2010 15:54:29 -0500
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <AANLkTikiXZPV8xnzQm2mwhDe8OdoREoqtpB6bCgX6lFh@mail.gmail.com>
References: <AANLkTikpajF82Un0JQ4aaGSOZE0kdRsV41BbWiDikyIE@mail.gmail.com> 
	<AANLkTikiXZPV8xnzQm2mwhDe8OdoREoqtpB6bCgX6lFh@mail.gmail.com>
Message-ID: <AANLkTimeWwZJya708qKpuGy0_Qwd_02Y63Em1XtPSchC@mail.gmail.com>

On Thu, Jun 17, 2010 at 3:18 PM, Kevin Teague <kevin at bud.ca> wrote:

>
> Previously in this thread, there have been several plausible
>> suggestions for modifying (improving?) zc.buildout to cope with the
>> issues you've identified.  Have you relayed these suggestions to the
>> zc.buildout developers and administrators?  Do you know for a fact
>> that zc.buildout can't be fixed?  If so, perhaps it should be removed
>> from PyPI; I certainly wouldn't want to rely on it.
>>
>>
> Didn't Setuptools/easy_install begin this policy of following the
> download_url in PyPI's early days, when it wasn't even possible to upload
> to PyPI (or at least during the transition when a majority of packages only
> provided download_urls)? easy_install has been repeatedly critiqued for this
> behaviour.
>
> Can anyone say why pip and buildout follow this policy? Has there been any
> thought to changing the install tools themselves?
>

To the degree people have tested their installation procedures, they've
usually tested that it works with easy_install.  easy_install in turn was
written to install stuff when there was some sane way to figure out what to
install.  So the tools are largely reactive.

Putting in a hard warning (e.g., one that requires hitting enter) might be
okay for some class of problematic behavior.  Deeper searching of links
could be handled this way, though for now we'd have to actually look in
those pages and only warn if something was found... so there'd be many of
the same problems but at least a path to removing the behavior completely.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From martin at v.loewis.de  Thu Jun 17 23:17:26 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 17 Jun 2010 23:17:26 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19D148.4000308@zopyx.com>
References: <4C19A308.5040806@zopyx.com> <4C19C7A0.9080800@v.loewis.de>
	<4C19CA43.9000509@zopyx.com> <4C19D061.5020303@v.loewis.de>
	<4C19D148.4000308@zopyx.com>
Message-ID: <4C1A90E6.8010304@v.loewis.de>

On 17.06.2010 09:39, Andreas Jung wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Martin v. Löwis wrote:
>
>> IMO, it's a waste of energy: if a package is useless, just don't use it,
>> and be done. There are many packages on PyPI that are useless to me
>> despite having a source release.
>>
>
> "useless" is not the point. The "availability" matters - the
> availability of a package must not depend on external servers other than an
> official PyPI server.

Why is that? You are talking about the Python Package *INDEX*. File hosting 
is an optional feature of the service.

Regards,
Martin

From martin at v.loewis.de  Thu Jun 17 23:21:53 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 17 Jun 2010 23:21:53 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19DF6F.9050106@zopyx.com>
References: <4C19A308.5040806@zopyx.com>
	<4C19D4CA.1090304@egenix.com>	<4C19D745.3050900@zopyx.com>
	<4C19DCA9.5010308@egenix.com> <4C19DF6F.9050106@zopyx.com>
Message-ID: <4C1A91F1.3040907@v.loewis.de>

> I don't care if it has a name and a version number. I was not able
> to work on my project - other co-workers also complained...this
> is not an acceptable situation...as a Python geek I can likely deal with
> that, others can't :)

Then complain to the python-openid authors. It's their fault that the 
package is unavailable, not PyPI's.

> We had such issues over and over again over the last years.
> A typical Zope/Plone installation requires over a hundred different
> packages and we have seen such failures with external servers
> various times. The workaround was creating PyPI mirrors, project related
> mirrors or download caches....just workarounds but not really a reliable
> and working infrastructure..

So go through and ask the authors of all these packages to upload to 
PyPI. Some may comply, others may not.

But first and foremost: reduce the set of dependencies. I see a 
ridiculous growth in dependencies. Consider rewriting small pieces of 
code instead of depending on a huge library just for a little function.

Regards,
Martin

From martin at v.loewis.de  Thu Jun 17 23:27:49 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 17 Jun 2010 23:27:49 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com>
	<4C19D745.3050900@zopyx.com> <4C19DCA9.5010308@egenix.com>
	<4C19DF6F.9050106@zopyx.com> <4C19E409.8060603@egenix.com>
	<4C19E54F.6030203@zopyx.com> <4C19F011.6010501@egenix.com>
	<hvctat$k9$1@dough.gmane.org> <4C19FD2A.3050801@egenix.com>
	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
Message-ID: <4C1A9355.2030807@v.loewis.de>

> I see a point in that, but what is more important, having a catalog to
> browse or having a reliable repository of software to download?

It's the Python Package Index, so clearly, the catalog function is more 
important than the reliable repository function. People use PyPI to find 
out whether a Python module for a certain problem exists.

Only some of the users use it to automatically download from it in a 
regular manner.

> How about only listing packages with provided source code on the simple
> interface?

If you, as a user, have a policy to not use packages which you can't 
download from PyPI, can't you just ignore those packages when browsing?

> afaik buildout always uses that, so a package python-openid is visible in
> the end-user view, but not installable via buildout. That way nobody would
> ever have created a dependency on it in the first place.

Apparently, whoever created the dependency on python-openid didn't worry 
about this specific issue.

FWIW, I evaluated python-openid, and found that it's better to rewrite 
it than to reuse it (regardless of where it's hosted).

Regards,
Martin

From martin at v.loewis.de  Thu Jun 17 23:30:19 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 17 Jun 2010 23:30:19 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C1A0D3C.4050402@zopyx.com>
References: <4C19A308.5040806@zopyx.com>	<4C19D4CA.1090304@egenix.com>	<4C19D745.3050900@zopyx.com>	<4C19DCA9.5010308@egenix.com>	<4C19DF6F.9050106@zopyx.com>	<4C19E409.8060603@egenix.com>	<4C19E54F.6030203@zopyx.com>	<4C19F011.6010501@egenix.com>	<hvctat$k9$1@dough.gmane.org>	<4C19FD2A.3050801@egenix.com>	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>	<4C1A0992.7070507@egenix.com>
	<4C1A0D3C.4050402@zopyx.com>
Message-ID: <4C1A93EB.9020308@v.loewis.de>

> In theory yes, in real life no - I approached several package
> maintainers in the past for several reasons...some agree with the
> complaints, others just don't care. Some consider PyPI as their own
> private repository with their own rules and no need to care about the
> community e.g. by providing proper metadata (I call this anti-social and
> PyPI-misuse).

As the PyPI maintainer, I assure you that it is no misuse. Whether it's 
anti-social, I don't know.

So given that discussion, I'm now opposed to enforcing a policy here.
It's not a policy that all users can agree to.

Regards,
Martin

From martin at v.loewis.de  Thu Jun 17 23:32:55 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 17 Jun 2010 23:32:55 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C1A201F.6080609@egenix.com>
References: <4C19A308.5040806@zopyx.com>
	<4C19D4CA.1090304@egenix.com>	<4C19D745.3050900@zopyx.com>
	<4C19DCA9.5010308@egenix.com>	<4C19DF6F.9050106@zopyx.com>
	<4C19E409.8060603@egenix.com>	<4C19E54F.6030203@zopyx.com>
	<4C19F011.6010501@egenix.com>	<hvctat$k9$1@dough.gmane.org>
	<4C19FD2A.3050801@egenix.com>	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>	<4C1A0992.7070507@egenix.com>	<AANLkTikczowGSN5jgGA6l19opbzNpnyJppiVGDFVTxCV@mail.gmail.com>
	<4C1A201F.6080609@egenix.com>
Message-ID: <4C1A9487.5070108@v.loewis.de>

On 17.06.2010 15:16, M.-A. Lemburg wrote:
> Benji York wrote:
>> On Thu, Jun 17, 2010 at 7:40 AM, M.-A. Lemburg<mal at egenix.com>  wrote:
>>> http://pypi.python.org/simple/zc.buildout/
>>>
>>> BTW: what are all those bug links doing on the zc.buildout index page ?
>>
>> PyPI scrapes all the links from the long description; for many projects
>> that includes a change log with links to fixed bugs.
>
> Isn't that dangerous ?
>
> AFAIK, setuptools would start opening all those URLs and might
> find download files which are not necessarily under full control of
> the author, e.g. anyone could add a comment to a bug report or
> wiki page with a link to an egg file on some rogue server.

I think you misunderstand. Links originate *only* from the long 
description. The package owner has full control over that.

If you think the package owner is opening up a security threat by 
including the links in the first place - yes, that's indeed a risk.

Regards,
Martin

From martin at v.loewis.de  Thu Jun 17 23:35:03 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 17 Jun 2010 23:35:03 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C1A5315.6000501@zopyx.com>
References: <4C19A308.5040806@zopyx.com>	<4C19D4CA.1090304@egenix.com>	<4C19D745.3050900@zopyx.com>	<4C19DCA9.5010308@egenix.com>	<4C19DF6F.9050106@zopyx.com>	<4C19E409.8060603@egenix.com>	<4C19E54F.6030203@zopyx.com>	<4C19F011.6010501@egenix.com>	<hvctat$k9$1@dough.gmane.org>	<4C19FD2A.3050801@egenix.com>	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>	<FF59456C-D278-454F-959A-4725B40004F5@mac.com>
	<4C1A5315.6000501@zopyx.com>
Message-ID: <4C1A9507.5090302@v.loewis.de>

>> Note that even a requirement to upload a package to PyPI won't reliably
>> solve Andreas' problem, the package owner could remove a release or even
>> the entire package.
>
> Released is released. There are only very few cases where one should be
> allowed to remove packages (e.g. containing viruses, malware etc.).
> Otherwise released stuff must not be touched.

Not at all. If a package owner decides to delete a package, the package 
is completely erased from PyPI. This is how it is, and how it should be.
PyPI has no right to keep the file against the author's will.

Regards,
Martin

From martin at v.loewis.de  Thu Jun 17 23:36:39 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 17 Jun 2010 23:36:39 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <AANLkTilo6mcjGO7aW0Z3diXy8J_zQM63-pDwywTPiEp0@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com> <4C19FD2A.3050801@egenix.com>
	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
	<201006172255.49175.steve@pearwood.info>
	<AANLkTilo6mcjGO7aW0Z3diXy8J_zQM63-pDwywTPiEp0@mail.gmail.com>
Message-ID: <4C1A9567.1010703@v.loewis.de>

> Now, please tell me what you would do if sourceforge changes its url and
> returns a 404 on the old download page. Would you update all release
> information? If not, the next time I run a buildout where the
> configuration requires numpy in an old version and the download link is
> broken, my buildout breaks too. And there might be reasons why I stick to
> a specific older version.
> That's what I would like to avoid.

Maybe you should stop using buildout then, and switch to Debian 
packages. They typically get the dependencies right, and available.

Regards,
Martin

From martin at v.loewis.de  Thu Jun 17 23:41:58 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 17 Jun 2010 23:41:58 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <AANLkTinwQRX6dOUq559nYpEX028zTj5-Bfg4I1giDvds@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com> <4C19C7A0.9080800@v.loewis.de>
	<AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com>
	<hvdd8p$tmb$1@dough.gmane.org>	<AANLkTike8MpGVOi2PjDJAhM2zFBrDiOI_W7V6yqI0Hbc@mail.gmail.com>
	<4C1A607E.2030904@zopyx.com>	<AANLkTilqematflkFtlkxGtmPTHbTfjLo11rnnDKZqKi4@mail.gmail.com>
	<4C1A6383.80105@zopyx.com>	<AANLkTikU2E6yPUGSlD06WwLm9W21Oxe-LwfflydI1npr@mail.gmail.com>
	<AANLkTinwQRX6dOUq559nYpEX028zTj5-Bfg4I1giDvds@mail.gmail.com>
Message-ID: <4C1A96A6.3050101@v.loewis.de>

> It does?  I thought PyPI kept everything around (but hidden) unless the
> author went in and manually deleted old stuff.  You just need to go to a
> deep link, e.g., http://pypi.python.org/pypi/SomePackage/0.1

Sure, but owners *do* manually delete old stuff.

Regards,
Martin

From martin at v.loewis.de  Thu Jun 17 23:45:08 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 17 Jun 2010 23:45:08 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <28916768.post@talk.nabble.com>
References: <4C19A308.5040806@zopyx.com> <28916768.post@talk.nabble.com>
Message-ID: <4C1A9764.9080408@v.loewis.de>

> What we buildout users really want is for the simple index to contain a copy
> of the uploaded files (or at least the source packages). Instead of creating
> links to other referenced urls in the simple index, setuptools / distribute
> could be used to fetch the package and store a copy. A flag could be set on
> indexed proprietary packages to exclude them from the simple index.
>
> There would seem to be a great benefit to doing this centrally and mirroring
> out the result rather than multiple companies maintaining their own
> individual pypi mirrors.

I can understand the need, but I would propose an entirely different 
solution:

Have buildout, by default, reject downloads from a different server. 
Then, when you create the dependency, you already notice the problem, 
and may choose to drop the dependency.
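
A minimal sketch of such a check, assuming the client knows each candidate
URL before fetching it. host_allowed and the pattern list are hypothetical
names for illustration; this is not buildout's code. (easy_install's
--allow-hosts option works along similar lines.)

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def host_allowed(url, allowed_patterns):
    """Return True if the URL's host matches one of the glob patterns."""
    # urlparse lower-cases the hostname; URLs without a host (file://) give None.
    host = urlparse(url).hostname or ""
    return any(fnmatchcase(host, pattern) for pattern in allowed_patterns)


# Assumed policy: only the index server itself may serve downloads.
ALLOWED = ["pypi.python.org", "*.python.org"]

for candidate in (
    "http://pypi.python.org/packages/source/d/demo/demo-1.0.tar.gz",
    "http://downloads.sourceforge.net/demo/demo-1.0.tar.gz",
    "file:///tmp/demo-1.0.tar.gz",
):
    verdict = "accept" if host_allowed(candidate, ALLOWED) else "reject"
    print(verdict, candidate)
```

With a default like this, an externally hosted dependency fails at the
moment it is added, not months later when the external server goes away.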

I don't think any policy change will force users to upload if they 
really don't want to. Instead, the major effect of the policy 
(apparently) would be that they stop registering with PyPI.

Regards,
Martin

From martin at v.loewis.de  Thu Jun 17 23:51:21 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 17 Jun 2010 23:51:21 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <AANLkTikiXZPV8xnzQm2mwhDe8OdoREoqtpB6bCgX6lFh@mail.gmail.com>
References: <AANLkTikpajF82Un0JQ4aaGSOZE0kdRsV41BbWiDikyIE@mail.gmail.com>
	<AANLkTikiXZPV8xnzQm2mwhDe8OdoREoqtpB6bCgX6lFh@mail.gmail.com>
Message-ID: <4C1A98D9.6070707@v.loewis.de>

> Didn't Setuptools/easy_install begin this policy of following the
> download_url in PyPI's early days, when it wasn't even possible to
> upload to PyPI (or at least during the transition when a majority of
> packages only provided download_urls)?

Not sure whether this was rhetoric: yes, that's how it all started.

PyPI/the cheeseshop was originally *just* a package index, and designed 
as such. Automated downloads weren't even considered, but the objective 
was to give people a way of registering and finding Python software 
(because the manually-maintained lists of Python software started to rot).

I added file upload at some point, primarily because people asked for it 
who didn't have any web hosting elsewhere. It was assumed that most 
packages would release somewhere to the net, and only few packages would 
use the file upload.

FWIW, the documentation upload started with the very same assumption, 
and it's probably still the case that people host documentation at PyPI 
only if they have nothing better.

Regards,
Martin

From ronaldoussoren at mac.com  Thu Jun 17 23:40:13 2010
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 17 Jun 2010 23:40:13 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C1A5315.6000501@zopyx.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com>
	<4C19D745.3050900@zopyx.com> <4C19DCA9.5010308@egenix.com>
	<4C19DF6F.9050106@zopyx.com> <4C19E409.8060603@egenix.com>
	<4C19E54F.6030203@zopyx.com> <4C19F011.6010501@egenix.com>
	<hvctat$k9$1@dough.gmane.org> <4C19FD2A.3050801@egenix.com>
	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
	<FF59456C-D278-454F-959A-4725B40004F5@mac.com>
	<4C1A5315.6000501@zopyx.com>
Message-ID: <CD923D12-F4B1-4FFA-B8A4-723A991E4916@mac.com>



On Jun 17, 2010, at 18:53, Andreas Jung <lists at zopyx.com> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Ronald Oussoren wrote:
>> 
>> On 17 Jun, 2010, at 13:20, Patrick Gerken wrote:
>>> 
>>> 
>>>    Please have a look at the package in question. The only problem
>>>    with it is that the download URL registered on PyPI no longer works.
>>>    It redirects to the download page where you can find the source
>>>    distribution.
>>> 
>>> 
>>> And that's exactly what Andreas' argument is targeting.
>>> 
>> 
>> Note that even a requirement to upload a package to PyPI won't reliably
>> solve Andreas' problem, the package owner could remove a release or even
>> the entire package.  
> 
> Released is released. There are only very few cases where one should be
> allowed to remove packages (e.g. containing viruses, malware etc.).
> Otherwise released stuff must not be touched.

I agree that it would in most cases be better to keep releases around, but a developer might not have the option to do so for legal reasons.

And as someone else noted, uploading to PyPI might not be possible either for legal reasons, such as for cryptographic software.

Ronald
> 
> - -aj
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (Darwin)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
> 
> iEYEARECAAYFAkwaUxUACgkQCJIWIbr9KYxmnACaAwDSSRLdU4wViW+Bql6sKMmt
> XXkAoLSsgw7A5BIizfZcEqM9WxqnT2+C
> =j+F8
> -----END PGP SIGNATURE-----
> <lists.vcf>

From ben+python at benfinney.id.au  Fri Jun 18 01:20:35 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Fri, 18 Jun 2010 09:20:35 +1000
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
References: <4C19A308.5040806@zopyx.com> <hvcea7$cb5$1@dough.gmane.org>
	<201006180111.02363.steve@pearwood.info>
Message-ID: <87vd9hp7p8.fsf@benfinney.id.au>

Steven D'Aprano <steve at pearwood.info> writes:

> On Thu, 17 Jun 2010 04:11:19 pm Christian Zagrodnick wrote:
> > On 2010-06-17 06:22:32 +0200, Andreas Jung <lists at zopyx.com> said:
> > >  - one release of registered package on PyPI _must_ contain
> > >    a valid source code distribution (sdist)
>
> -1000
>
> Please take your religious wars elsewhere.

Please address the substance of the proposal. It's neither religious,
nor anything to do with war.

> Python might be open source software, but there is no requirement that
> only open source software can be written in Python, and PyPI is for
> all Python developers, not just FOSS developers.

True enough. It could be otherwise, though, so the proposal is hardly
deserving of the slurs you hurled in your first paragraphs.

-- 
 \           “We now have access to so much information that we can find |
  `\  support for any prejudice or opinion.” -- David Suzuki, 2008-06-27 |
_o__)                                                                    |
Ben Finney


From domen at dev.si  Fri Jun 18 01:29:09 2010
From: domen at dev.si (Domen Kožar)
Date: Fri, 18 Jun 2010 01:29:09 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
Message-ID: <1276817349.5093.19.camel@oblak>

I'm looking at PyPI as infrastructure and as an upstream source for Linux
distributions.

* Renaming packages

I would strongly say NO to this one. Once you make a release, don't
change it. If a mistake was made in the metadata or packaging, make a new
release, e.g. 1.0 -> 1.0-r1.

* Source code requirement

This one really depends on the main purpose of PyPI. If it's only there
as a provider of metadata garbage, then no rules should be applied. If
its main goal is to provide downloadable packages accompanied by
metadata, then source could be a requirement.

Companies using PyPI as an index of metadata: that's nonsense. They can
set up their own PyPI mirror, and that would even be the more proper way.

My 2 cents, Domen

From steve at pearwood.info  Fri Jun 18 04:21:16 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 18 Jun 2010 12:21:16 +1000
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <4C1A5437.4090804@zopyx.com>
References: <4C19A308.5040806@zopyx.com>
	<201006172255.49175.steve@pearwood.info>
	<4C1A5437.4090804@zopyx.com>
Message-ID: <201006181221.16797.steve@pearwood.info>

On Fri, 18 Jun 2010 02:58:31 am Andreas Jung wrote:
> Steven D'Aprano wrote:
> > On Thu, 17 Jun 2010 09:20:55 pm Patrick Gerken wrote:
> >> Not putting the source release on pypi is just one indicator of
> >> crappy software.
> >
> > Yeah, like that infamous example of crappy software, Numpy.
> >
> > http://pypi.python.org/pypi/numpy/1.4.1
>
> What's wrong with this package? It seems properly packaged, has
> proper metadata....?

And it is hosted on Sourceforge.

But you're right, there is nothing wrong with the package. That includes 
the fact that it's hosted external to PyPI. Why should we force numpy 
to change? Whatever their reasons for hosting on Sourceforge, it is 
their package and their choice and we should respect that and not try 
to dictate where they host it.



-- 
Steven D'Aprano

From steve at pearwood.info  Fri Jun 18 04:35:04 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 18 Jun 2010 12:35:04 +1000
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <4C1A5E0D.7060102@zopyx.com>
References: <4C19A308.5040806@zopyx.com> <4C1A59A9.7030204@egenix.com>
	<4C1A5E0D.7060102@zopyx.com>
Message-ID: <201006181235.04598.steve@pearwood.info>

On Fri, 18 Jun 2010 03:40:29 am Andreas Jung wrote:
> M.-A. Lemburg wrote:
> > If you ship Python-only packages with precompiled .pyc/.pyo
> > files, you do need to upload one version per Python version.
> > The marshal format and pyc magic often changes between releases.
>
> Once again: I am talking about the majority of packages that are
> neither commercial nor shipping without the Python source code.

Firstly, commercial is not the opposite of source-code provided. Why do 
so many FOSS advocates insist on giving the message that it is? *Closed 
source* is the opposite of open source.

You earlier said that PyPI should force all packages to include source 
code. Are you now saying that PyPI should only force packages to 
include source code if they include source code, that is, that package 
owners can opt-out of this rule "you must provide source code" by 
simply not providing source code?

If not, then what exactly are you saying?


> >  * make PyPI itself more robust and stable (we're working on that)
>
> PyPI is pretty robust and this has nothing to do with packages hosted
> externally.

"Pretty robust" isn't robust enough, which is why there are proposals to 
shift PyPI to a commercial high-availability hosting service *and* to 
mirror it extensively.

As for the second part of your statement, of course externally hosted 
packages don't increase the stability of PyPI itself, but they limit 
the harm from any single outage and distribute the load over the entire 
internet rather than one single site.

External hosting is "don't put all your eggs in one basket", as well 
as "competition between hosting providers" and "freedom of choice". 
After all, PyPI is intended to be an *index* of Python software, not a 
hosting service. The hosting is an optional bonus. Don't think I'm not 
grateful for that, but I object strongly to your suggestion that I 
should be *forced* to host my packages on PyPI if I want to register 
the package there.


> > Suggesting that they can never remove a release from PyPI
> > or are not allowed to rename a package is not going to
> > attract more developers to PyPI.
>
> I would not care about such developers. 

Then don't use their packages, but don't stop other people from using 
them.


> The basic question is: do we want PyPI being a reliable and valuable
> community resource or a partly unflushed package toilet?

The basic question is, who has the right to control the packages indexed 
on PyPI? Is it the package author, or you?



-- 
Steven D'Aprano

From ben+python at benfinney.id.au  Fri Jun 18 04:57:33 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Fri, 18 Jun 2010 12:57:33 +1000
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
References: <4C19A308.5040806@zopyx.com> <4C1A59A9.7030204@egenix.com>
	<4C1A5E0D.7060102@zopyx.com> <201006181235.04598.steve@pearwood.info>
Message-ID: <87mxutoxnm.fsf@benfinney.id.au>

Steven D'Aprano <steve at pearwood.info> writes:

> On Fri, 18 Jun 2010 03:40:29 am Andreas Jung wrote:
> > The basic question is: do we want PyPI being a reliable and valuable
> > community resource or a partly unflushed package toilet?
>
> The basic question is, who has the right to control the packages indexed 
> on PyPI? Is it the package author, or you?

That doesn't seem to be a question that addresses Andreas's argument (as
I understand it). I don't see Andreas arguing for anyone but the
copyright holder to have control of the *package*.

Rather, a more germane question would be: Who has the right to control
*which* packages get indexed at PyPI (of all those that might be
submitted to the index)?

My understanding is that Andreas is arguing that PyPI does, and should,
have that control; and that control can be exercised in different ways.

-- 
 \      “Often, the surest way to convey misinformation is to tell the |
  `\             strict truth.” -- Mark Twain, _Following the Equator_ |
_o__)                                                                  |
Ben Finney


From lists at zopyx.com  Fri Jun 18 05:35:44 2010
From: lists at zopyx.com (Andreas Jung)
Date: Fri, 18 Jun 2010 05:35:44 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C1A93EB.9020308@v.loewis.de>
References: <4C19A308.5040806@zopyx.com>	<4C19D4CA.1090304@egenix.com>	<4C19D745.3050900@zopyx.com>	<4C19DCA9.5010308@egenix.com>	<4C19DF6F.9050106@zopyx.com>	<4C19E409.8060603@egenix.com>	<4C19E54F.6030203@zopyx.com>	<4C19F011.6010501@egenix.com>	<hvctat$k9$1@dough.gmane.org>	<4C19FD2A.3050801@egenix.com>	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>	<4C1A0992.7070507@egenix.com>
	<4C1A0D3C.4050402@zopyx.com> <4C1A93EB.9020308@v.loewis.de>
Message-ID: <4C1AE990.8030901@zopyx.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin v. Löwis wrote:
>> In theory yes, in real life no - I approached several package
>> maintainers in the past due to several reasons..some agree with the
>> complaints, others just don't care. Some consider PyPI as their own
>> private repository with their own rules and no need to care about the
>> community e.g. by providing proper metadata (I call this anti-social and
>> PyPI-misuse).
> 
> As the PyPI maintainer, I assure you that it is no misuse. Whether it's
> anti-social, I don't know.

OK - so you claim that everyone should be allowed to unload their
unlabeled garbage in public places?
> 
> So given that discussion, I'm now opposed to enforcing a policy here.

OK - so we have to live with PyPI as a package dumpster.

Andreas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwa6Y8ACgkQCJIWIbr9KYyH5ACcDgFo+H3fjpUWyAWc8L/V+dcP
MU8AnjdVWUfmaPSl3l74kOYAg/rKhdTv
=Ht7H
-----END PGP SIGNATURE-----

From lists at zopyx.com  Fri Jun 18 05:49:46 2010
From: lists at zopyx.com (Andreas Jung)
Date: Fri, 18 Jun 2010 05:49:46 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C19A308.5040806@zopyx.com>
References: <4C19A308.5040806@zopyx.com>
Message-ID: <4C1AECDA.2000009@zopyx.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I retract this proposal and accept the fact that obviously nobody
outside the Zope/Plone world is really interested in bringing PyPI
forward, and that the freedom to register and upload packages in
whatever state to PyPI is put above the needs of a well-maintained and
reliable package index. After almost 20 years I am still under the
impression that we are still in kindergarten.

Deeply frustrated,
Andreas

Andreas Jung wrote:
> Hi there,
> 
> I propose a policy change for packages registered with PyPI:
> 
>  - packages registered on PyPI have at least one release
> 
>  - one release of a registered package on PyPI _must_ contain
>    a valid source code distribution (sdist)
> 
>  - packages registered on PyPI without releases or without
>    source code release are subject to be removed after N days
>    after the day of registration
> 
> Why?
> 
> Any package registered on PyPI is possibly crucial to any kind of
> development and deployment.
> 
> Packages hosted on external servers (referenced through a download_url)
> are subject to come and go - packages once released should be available
> at any time from a well-known location (PyPI). Dependencies on the
> availability of external downloads servers other than PyPI are hardly
> acceptable for real-world development and deployments.
> 
> As an example: the Plone CMS buildouts depend on python-openid.
> This package is registered with PyPI
> 
> http://pypi.python.org/pypi/python-openid
> 
> but refers to
> 
> http://openidenabled.com/files/python-openid/packages/python-openid-2.2.4.tar.gz
> 
> For whatever reason the download URL is no longer working. In fact:
> openidenabled.com now points to http://www.janrain.com.
> 
> Other reasons for disappearing packages in the past:
> 
>  - network or server outages of external servers
>  - users changed their organization and the organization removed
>    content of their former employees
> 
> PyPI is a valuable and crucial resource for Python development.
> It must be kept up-to-date and consistent.
> 
> I don't care about the arguments that were made in the past against
> stronger rules ("openness" etc.).
> 
> There are a lot of Python programmers around who are not Python geeks
> as most of us are, and they just become pissed off when packages come and
> go or are not in the place where one would expect them.
> 
> PyPI is a community resource - but community does not mean anarchy where
> everyone should be able to upload their package crap without looking left
> and right and having the community and its needs in mind.
> 
> PyPI must become a stable package index. Everything registered with PyPI
> must be available at any time (mirrors, distributing PyPI in the cloud...).
> 
> Andreas
> 

- ------------------------------------------------------------------------

- -- 
ZOPYX Limited           | zopyx group
Charlottenstr. 37/1     | The full-service network for Zope & Plone
D-72070 Tübingen        | Produce & Publish
www.zopyx.com           | www.produce-and-publish.com
- ------------------------------------------------------------------------
E-Publishing, Python, Zope & Plone development, Consulting


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwa7NoACgkQCJIWIbr9KYxOpgCcD6DBM0ThxmShMrOzFQEAJkye
ZVoAoMavJSWWfTg/3ahy1X3bQ5PN7bLk
=7/GJ
-----END PGP SIGNATURE-----

From fdrake at acm.org  Fri Jun 18 06:02:28 2010
From: fdrake at acm.org (Fred Drake)
Date: Fri, 18 Jun 2010 00:02:28 -0400
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <AANLkTikpajF82Un0JQ4aaGSOZE0kdRsV41BbWiDikyIE@mail.gmail.com>
References: <AANLkTikpajF82Un0JQ4aaGSOZE0kdRsV41BbWiDikyIE@mail.gmail.com>
Message-ID: <AANLkTik88fGbrclS6DIYXY9PyHmQapPalImthzrI5KGL@mail.gmail.com>

On Thu, Jun 17, 2010 at 3:58 PM, Jess Austin <jess.austin at gmail.com> wrote:
> In response, a question: is there anyone who supports this radical policy
> change who is NOT a zc.buildout user?

I'm a zc.buildout user, and I *don't* support this policy change.
This change is entirely unnecessary.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"A storm broke loose in my mind."  --Albert Einstein

From martin at v.loewis.de  Fri Jun 18 07:53:18 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 18 Jun 2010 07:53:18 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C1AE990.8030901@zopyx.com>
References: <4C19A308.5040806@zopyx.com>	<4C19D4CA.1090304@egenix.com>	<4C19D745.3050900@zopyx.com>	<4C19DCA9.5010308@egenix.com>	<4C19DF6F.9050106@zopyx.com>	<4C19E409.8060603@egenix.com>	<4C19E54F.6030203@zopyx.com>	<4C19F011.6010501@egenix.com>	<hvctat$k9$1@dough.gmane.org>	<4C19FD2A.3050801@egenix.com>	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>	<4C1A0992.7070507@egenix.com>
	<4C1A0D3C.4050402@zopyx.com> <4C1A93EB.9020308@v.loewis.de>
	<4C1AE990.8030901@zopyx.com>
Message-ID: <4C1B09CE.5080608@v.loewis.de>

>> As the PyPI maintainer, I assure you that it is no misuse. Whether it's
>> anti-social, I don't know.
>
> OK - so you claim that everyone should be allowed to unload their
> unlabeled garbage in public places?

Correct - as long as it's a Python package. Mere spam will be deleted.

>>
>> So given that discussion, I'm now opposed to enforcing a policy here.
>
> OK - so we have to live with PyPI as a package dumpster.

Indeed. I don't think that requiring a source upload would change that.

Regards,
Martin

From simon at ikanobori.jp  Fri Jun 18 10:24:32 2010
From: simon at ikanobori.jp (Simon de Vlieger)
Date: Fri, 18 Jun 2010 10:24:32 +0200
Subject: [Catalog-sig] PyPI template improvements
In-Reply-To: <4C1A8928.8090709@v.loewis.de>
References: <AAD06C81-9B08-4C51-87CD-4C681D91A09F@ikanobori.jp>	<4C194755.2060704@v.loewis.de>
	<hvd3jl$muv$1@dough.gmane.org> <4C1A8928.8090709@v.loewis.de>
Message-ID: <A4976DA1-EA5C-4812-B2B8-2C45D6F9082D@ikanobori.jp>

On 17 jun 2010, at 22:44, Martin v. Löwis wrote:

>> In web app land, "supported browsers" usually means the ones the
>> designer targets:  e.g., including "IE>= 7" in the list means that the
>> designer doesn't have to include workarounds for stupid glitches in
>> earlier IEs (or even test the design against those versions).
>>
>> For CSS, this means that the site's appearance will be sometimes wonky
>> when running with an older-than-supported browser version.  Features
>> which depend on Javascript may not work at all, or only in degraded
>> mode.
>
> I have a really hard time answering that question then: there was no
> web designer involved in creating PyPI (*). The browser that the
> *authors* of the service target are really the ones I mentioned: all
> of them.
>
> There is one browser that gets special attention, and flaws relating
> to it get fixed faster than for any other browser: setuptools.
>
> Regards,
> Martin
>
> (*) of course, it uses the layout of python.org, which did have a web
> designer; for this design, I don't know the answer.

Martin,

a question from me: does setuptools browse the main PyPI pages, or does
it use the simple version?

Another question: if there is a need for JavaScript on the page (don't
worry about making it inaccessible, I'll make everything degrade
nicely), am I allowed to include a JavaScript framework? Right now I'm
looking at jQuery (http://jquery.com/); would there be anything against
this?

I have already done a few items from my list and a few of the items
which were proposed by the distutils-sig mailing list. Over the weekend
I plan to do a nice chunk of work.

Regards,

Simon de Vlieger


From mal at egenix.com  Fri Jun 18 10:33:42 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 18 Jun 2010 10:33:42 +0200
Subject: [Catalog-sig] PyPI template improvements
In-Reply-To: <A4976DA1-EA5C-4812-B2B8-2C45D6F9082D@ikanobori.jp>
References: <AAD06C81-9B08-4C51-87CD-4C681D91A09F@ikanobori.jp>	<4C194755.2060704@v.loewis.de>	<hvd3jl$muv$1@dough.gmane.org>
	<4C1A8928.8090709@v.loewis.de>
	<A4976DA1-EA5C-4812-B2B8-2C45D6F9082D@ikanobori.jp>
Message-ID: <4C1B2F66.1050502@egenix.com>

Simon de Vlieger wrote:
> On 17 jun 2010, at 22:44, Martin v. Löwis wrote:
> 
>>> In web app land, "supported browsers" usually means the ones the
>>> designer targets:  e.g., including "IE>= 7" in the list means that the
>>> designer doesn't have to include workarounds for stupid glitches in
>>> earlier IEs (or even test the design against those versions).
>>>
>>> For CSS, this means that the site's appearance will be sometimes wonky
>>> when running with an older-than-supported browser version.  Features
>>> which depend on Javascript may not work at all, or only in degraded
>>> mode.
>>
>> I have a really hard time answering that question then: there was no
>> web designer involved in creating PyPI (*). The browser that the
>> *authors* of the service target are really the ones I mentioned: all
>> of them.
>>
>> There is one browser that gets special attention, and flaws relating
>> to it get fixed faster than for any other browser: setuptools.
>>
>> Regards,
>> Martin
>>
>> (*) of course, it uses the layout of python.org, which did have a web
>> designer; for this design, I don't know the answer.
> 
> Martin,
> 
> a question from me. Does setuptools browse the main pypi pages or does
> it use the simple version?

setuptools used to parse the main web pages of PyPI. This was
then changed and the /simple index invented. All recent versions
of setuptools default to using the /simple index.
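
As a rough illustration of what a client does with the links it finds on
a /simple page: the sketch below keeps only the URLs whose file name looks
like a source distribution of the requested project. sdist_candidates, the
suffix list and the sample URLs are assumptions made up for this example;
setuptools' real matching logic is more involved.

```python
import posixpath
from urllib.parse import urlparse

# File name endings treated as source distributions in this sketch.
SDIST_SUFFIXES = (".tar.gz", ".tar.bz2", ".tgz", ".zip")


def sdist_candidates(project, urls):
    """Return the URLs whose file name looks like '<project>-<version>.<sdist suffix>'."""
    prefix = project.lower() + "-"
    found = []
    for url in urls:
        # urlparse drops the '#md5=...' fragment, so only the path is inspected.
        filename = posixpath.basename(urlparse(url).path).lower()
        if filename.startswith(prefix) and filename.endswith(SDIST_SUFFIXES):
            found.append(url)
    return found


# Made-up links as they might appear on a project's /simple page.
page_links = [
    "http://pypi.python.org/packages/source/d/demo/demo-1.0.tar.gz#md5=abc",
    "https://bugs.example.org/demo/+bug/1",
    "http://example.org/dist/other-1.0.zip",
]
print(sdist_candidates("demo", page_links))
```

Links that do not name a matching file, such as the bug tracker URL, are
not download candidates themselves, although a client may still open them
to look for further links.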

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 18 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                30 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From do3ccqrv at googlemail.com  Fri Jun 18 10:49:04 2010
From: do3ccqrv at googlemail.com (Patrick Gerken)
Date: Fri, 18 Jun 2010 10:49:04 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <hvdd8p$tmb$1@dough.gmane.org>
References: <4C19A308.5040806@zopyx.com> <4C19C7A0.9080800@v.loewis.de> 
	<AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com> 
	<hvdd8p$tmb$1@dough.gmane.org>
Message-ID: <AANLkTingS0S5IVnKPD8V4SkoIsd6LfNwTJZAYgDT0Wt7@mail.gmail.com>

On Thu, Jun 17, 2010 at 16:59, Tres Seaver <tseaver at palladion.com> wrote:

> All of which make it impossible to reliably and repeatably deploy
> arbitrary software configurations (directly) from PyPI.  Managing your
> own project-specific index is the only real solution.
>

When I provide buildout configurations for open source packages I
don't like to provide a custom index for them. It increases the effort
if they would like to test the same package with newer versions, want
to update the known good set or would like to extend the package.

For customer projects it's a very good solution, I agree.

Best regards,

          Patrick

From mal at egenix.com  Fri Jun 18 11:10:43 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 18 Jun 2010 11:10:43 +0200
Subject: [Catalog-sig] Extra links on the PyPI /simple index package
	pages
In-Reply-To: <4C1A9487.5070108@v.loewis.de>
References: <4C19A308.5040806@zopyx.com>	<4C19D4CA.1090304@egenix.com>	<4C19D745.3050900@zopyx.com>	<4C19DCA9.5010308@egenix.com>	<4C19DF6F.9050106@zopyx.com>	<4C19E409.8060603@egenix.com>	<4C19E54F.6030203@zopyx.com>	<4C19F011.6010501@egenix.com>	<hvctat$k9$1@dough.gmane.org>	<4C19FD2A.3050801@egenix.com>	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>	<4C1A0992.7070507@egenix.com>	<AANLkTikczowGSN5jgGA6l19opbzNpnyJppiVGDFVTxCV@mail.gmail.com>	<4C1A201F.6080609@egenix.com>
	<4C1A9487.5070108@v.loewis.de>
Message-ID: <4C1B3813.3010102@egenix.com>

"Martin v. L?wis" wrote:
> Am 17.06.2010 15:16, schrieb M.-A. Lemburg:
>> Benji York wrote:
>>> On Thu, Jun 17, 2010 at 7:40 AM, M.-A. Lemburg<mal at egenix.com>  wrote:
>>>> http://pypi.python.org/simple/zc.buildout/
>>>>
>>>> BTW: what are all those bug links doing on the zc.buildout index page ?
>>>
>>> PyPI scrapes all the links from the long description; for many projects
>>> that includes a change log with links to fixed bugs.
>>
>> Isn't that dangerous ?
>>
>> AFAIK, setuptools would start opening all those URLs and might
>> find download files which are not necessarily under full control of
>> the author, e.g. anyone could add a comment to a bug report or
>> wiki page with a link to an egg file on some rogue server.
> 
> I think you misunderstand. Links originate *only* from the long
> description. The package owner has full control over that.

I was referring to the linked assets that the package owner
may not have full control over, e.g. in the above case,
you have links pointing to launchpad and one to "file://".

Such links (except the file:// one) can be useful in the
package description, e.g. to point to a bug tracking
system, documentation or other resources, but they are
not really needed to point setuptools to download locations.

> If you think the package owner is opening up a security threat by
> including the links in the first place - yes, that's indeed a risk.

Is this feature still needed for setuptools ?

We have download URLs and homepage URLs which should be enough for
setuptools to search and find the links to package download files.

If it's no longer needed, then it'd be safer not to include
the long description links on the /simple index pages anymore.

-- 
Marc-Andre Lemburg
eGenix.com


From ianb at colorstudy.com  Fri Jun 18 17:57:10 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 18 Jun 2010 10:57:10 -0500
Subject: [Catalog-sig] Extra links on the PyPI /simple index package
	pages
In-Reply-To: <4C1B3813.3010102@egenix.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com> 
	<4C19D745.3050900@zopyx.com> <4C19DCA9.5010308@egenix.com> 
	<4C19DF6F.9050106@zopyx.com> <4C19E409.8060603@egenix.com> 
	<4C19E54F.6030203@zopyx.com> <4C19F011.6010501@egenix.com> 
	<hvctat$k9$1@dough.gmane.org> <4C19FD2A.3050801@egenix.com> 
	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com> 
	<4C1A0992.7070507@egenix.com>
	<AANLkTikczowGSN5jgGA6l19opbzNpnyJppiVGDFVTxCV@mail.gmail.com> 
	<4C1A201F.6080609@egenix.com> <4C1A9487.5070108@v.loewis.de> 
	<4C1B3813.3010102@egenix.com>
Message-ID: <AANLkTimIh5Dt6ogDprm6XqXaZDzI0fJuBOdVERBbL0Nk@mail.gmail.com>

On Fri, Jun 18, 2010 at 4:10 AM, M.-A. Lemburg <mal at egenix.com> wrote:

> > If you think the package owner is opening up a security threat by
> > including the links in the first place - yes, that's indeed a risk.
>
> Is this feature still needed for setuptools ?
>

It's fairly regularly used to link to repositories, e.g., I might put this
text in a description:

  To install `the tip tarball <
http://bitbucket.org/ianb/webob/get/tip.gz#egg=webob-dev>`_ use ``pip
install webob==dev``

-- 
Ian Bicking  |  http://blog.ianbicking.org

From ianb at colorstudy.com  Fri Jun 18 18:01:44 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 18 Jun 2010 11:01:44 -0500
Subject: [Catalog-sig] Extra links on the PyPI /simple index package
	pages
In-Reply-To: <AANLkTimIh5Dt6ogDprm6XqXaZDzI0fJuBOdVERBbL0Nk@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com> <4C19D4CA.1090304@egenix.com> 
	<4C19D745.3050900@zopyx.com> <4C19DCA9.5010308@egenix.com> 
	<4C19DF6F.9050106@zopyx.com> <4C19E409.8060603@egenix.com> 
	<4C19E54F.6030203@zopyx.com> <4C19F011.6010501@egenix.com> 
	<hvctat$k9$1@dough.gmane.org> <4C19FD2A.3050801@egenix.com> 
	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com> 
	<4C1A0992.7070507@egenix.com>
	<AANLkTikczowGSN5jgGA6l19opbzNpnyJppiVGDFVTxCV@mail.gmail.com> 
	<4C1A201F.6080609@egenix.com> <4C1A9487.5070108@v.loewis.de> 
	<4C1B3813.3010102@egenix.com>
	<AANLkTimIh5Dt6ogDprm6XqXaZDzI0fJuBOdVERBbL0Nk@mail.gmail.com>
Message-ID: <AANLkTilacuiUvjjekDQW97ke2Y21JiA-XKT1nV37fLg0@mail.gmail.com>

On Fri, Jun 18, 2010 at 10:57 AM, Ian Bicking <ianb at colorstudy.com> wrote:

> On Fri, Jun 18, 2010 at 4:10 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>
>> > If you think the package owner is opening up a security threat by
>> > including the links in the first place - yes, that's indeed a risk.
>>
>> Is this feature still needed for setuptools ?
>>
>
> It's fairly regularly used to link to repositories, e.g., I might put this
> text in a description:
>
>   To install `the tip tarball <
> http://bitbucket.org/ianb/webob/get/tip.gz#egg=webob-dev>`_ use ``pip
> install webob==dev``
>

It should be noted, though, that these links must be self-describing, with
#egg in this case, or with a URL that is more obviously self describing like
http://example.com/nightlies/webob-nightly.tar.gz -- the problems people are
describing here are with fetching other pages and scanning them for links.
If I remember correctly homepage and download_url are fetched and scanned
for links, and those cause all the problems (especially homepage, as
download_url tends to point to something simpler and more reliable).
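
A minimal sketch of what "self-describing" means for the #egg form, using
only the Python standard library. The helper name egg_fragment is
hypothetical and this is not the code pip or setuptools actually runs; it
only shows that the project and version can be read from the URL itself,
without fetching anything.

```python
from urllib.parse import urlparse, parse_qs


def egg_fragment(url):
    """Return the value of an '#egg=' fragment, or None if the URL has none."""
    fragment = urlparse(url).fragment
    values = parse_qs(fragment).get("egg")
    return values[0] if values else None


tip = "http://bitbucket.org/ianb/webob/get/tip.gz#egg=webob-dev"
nightly = "http://example.com/nightlies/webob-nightly.tar.gz"

print(egg_fragment(tip))      # the fragment names the project and version
print(egg_fragment(nightly))  # no fragment; the file name has to carry that information
```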

A simple security hole would be having a homepage that is a wiki -- anyone
could edit the wiki and put up a link to a trojan package and it could get
found and installed.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From ianb at colorstudy.com  Fri Jun 18 18:44:25 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 18 Jun 2010 11:44:25 -0500
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
Message-ID: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>

With all the reliability discussion, I thought I'd offer a kind of
counterproposal, that we rewrite PyPI to use App Engine.

Of course, this means writing code, etc., but I believe this is a reasonable
goal.  I think if "we" (Catalog-SIG?  PyPI maintainers?) committed to using
such an implementation (assuming it was of good quality) that we could find
people (probably not on this list) to write and maintain the code.  People
have already rewritten PyPI a couple times, but no one knows what exactly to
*do* with the rewrite so they haven't gone anywhere.  And PyPI is not a
particularly complicated application.  I think we can set the bar high on
the implementation quality and that people will meet it, so long as they
know their effort won't be in vain.

Why App Engine?  The primary reason I'm proposing it is because it will be
much easier to manage.  If it runs out of memory it won't bring down a
machine.  If new people maintain the system it's easy to describe how to do
deployments, for instance.  It's easy for people to install their own PyPI
instances for development and to generate patches.  Hosted services can have
downtimes of course, but unlike currently there are other people (the App
Engine maintainers) who will resolve those problems.  There's still a class
of bugs like badly indexed tables or weird locking issues that could bring
PyPI down and "we" would have to fix it, and with a rewrite there's more of
a risk of that, but... it'll just take some testing to make sure things are
okay.

In terms of cost, I expect we can get free hosting, and packages can be
stored directly in the data store.  That doesn't preclude using a CDN like
CloudFront, but that can be handled separately.  Also since the index just
links to packages, packages can be incrementally uploaded to a CDN.

Besides a commitment to using the code (which I think is really important to
motivate people), a scrubbed dump of the database would be really helpful
for development.  I know we've passed around complete dumps to people, but
it contains private information so we can't put it up publicly which creates
a speed bump for developers.


Linkage...
A buzz post where I asked about it:
http://www.google.com/buzz/ianbicking/BRWDjsMCGWQ/I-like-the-original-proposal-move-PyPI-stuff-into

A PyPI *mirror* written for App Engine:
http://pypi.appspot.com/

A PyPI implementation in Django (one is a fork of the other?),
database-backed (would take some work to get it on App Engine):
http://pypi.python.org/pypi/djangopypi/
http://github.com/benliles/chishop


-- 
Ian Bicking  |  http://blog.ianbicking.org

From mark at geek.net  Fri Jun 18 18:47:21 2010
From: mark at geek.net (Mark Ramm)
Date: Fri, 18 Jun 2010 12:47:21 -0400
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <4C1A96A6.3050101@v.loewis.de>
References: <4C19A308.5040806@zopyx.com> <4C19C7A0.9080800@v.loewis.de>
	<AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com>
	<hvdd8p$tmb$1@dough.gmane.org>
	<AANLkTike8MpGVOi2PjDJAhM2zFBrDiOI_W7V6yqI0Hbc@mail.gmail.com>
	<4C1A607E.2030904@zopyx.com>
	<AANLkTilqematflkFtlkxGtmPTHbTfjLo11rnnDKZqKi4@mail.gmail.com>
	<4C1A6383.80105@zopyx.com>
	<AANLkTikU2E6yPUGSlD06WwLm9W21Oxe-LwfflydI1npr@mail.gmail.com>
	<AANLkTinwQRX6dOUq559nYpEX028zTj5-Bfg4I1giDvds@mail.gmail.com>
	<4C1A96A6.3050101@v.loewis.de>
Message-ID: <AANLkTinzXNTK7y200lvFdfqNmN-2Pfsu-BxrR1Y7ihT_@mail.gmail.com>

On Thu, Jun 17, 2010 at 5:41 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> It does?  I thought PyPI kept everything around (but hidden) unless the
>> author went in and manually deleted old stuff.  You just need to go to a
>> deep link, e.g., http://pypi.python.org/pypi/SomePackage/0.1
>
> Sure, but owners *do* manually delete old stuff.

Am I wrong in remembering that old packages get dropped from the
simple index?

I'm not saying they get deleted from the server, but they are made
unavailable to easy_install without special knowledge of how to get
them.  So old packages can have requirements in setup.py which become
unavailable for a simple install.

--Mark

From mark at geek.net  Fri Jun 18 18:49:39 2010
From: mark at geek.net (Mark Ramm)
Date: Fri, 18 Jun 2010 12:49:39 -0400
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <4C1A9567.1010703@v.loewis.de>
References: <4C19A308.5040806@zopyx.com> <4C19FD2A.3050801@egenix.com>
	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>
	<201006172255.49175.steve@pearwood.info>
	<AANLkTilo6mcjGO7aW0Z3diXy8J_zQM63-pDwywTPiEp0@mail.gmail.com>
	<4C1A9567.1010703@v.loewis.de>
Message-ID: <AANLkTikqruefSUbnd42ZRtCuWL-we2L0gx8b7VM8r244@mail.gmail.com>

On Thu, Jun 17, 2010 at 5:36 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> Now, please tell me what you would do if sourceforge changes its url and
>> returns a
>> 404 on the old download page. Would you update all release informations?

Well, at this point if sourceforge 404'ed on an old download page (as
opposed to redirecting) I'd get pretty mad, and either fix it or make
somebody fix it.   We depend on easy_install/pip as much as anybody --
so we'd be shooting ourselves in the foot too.

--Mark Ramm

From ianb at colorstudy.com  Fri Jun 18 19:01:48 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 18 Jun 2010 12:01:48 -0500
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <AANLkTinzXNTK7y200lvFdfqNmN-2Pfsu-BxrR1Y7ihT_@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com> <4C19C7A0.9080800@v.loewis.de> 
	<AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com> 
	<hvdd8p$tmb$1@dough.gmane.org>
	<AANLkTike8MpGVOi2PjDJAhM2zFBrDiOI_W7V6yqI0Hbc@mail.gmail.com> 
	<4C1A607E.2030904@zopyx.com>
	<AANLkTilqematflkFtlkxGtmPTHbTfjLo11rnnDKZqKi4@mail.gmail.com> 
	<4C1A6383.80105@zopyx.com>
	<AANLkTikU2E6yPUGSlD06WwLm9W21Oxe-LwfflydI1npr@mail.gmail.com> 
	<AANLkTinwQRX6dOUq559nYpEX028zTj5-Bfg4I1giDvds@mail.gmail.com> 
	<4C1A96A6.3050101@v.loewis.de>
	<AANLkTinzXNTK7y200lvFdfqNmN-2Pfsu-BxrR1Y7ihT_@mail.gmail.com>
Message-ID: <AANLkTikfHE_CBVZQkC1WlGc_ylcoOlhEcIxOD85aBcvl@mail.gmail.com>

On Fri, Jun 18, 2010 at 11:47 AM, Mark Ramm <mark at geek.net> wrote:

> On Thu, Jun 17, 2010 at 5:41 PM, "Martin v. Löwis" <martin at v.loewis.de>
> wrote:
> >> It does?  I thought PyPI kept everything around (but hidden) unless the
> >> author went in and manually deleted old stuff.  You just need to go to a
> >> deep link, e.g., http://pypi.python.org/pypi/SomePackage/0.1
> >
> > Sure, but owners *do* manually delete old stuff.
>
> Am I wrong in remembering that old packages get dropped from the
> simple index?
>
> I'm not saying they get deleted from the server, but they are made
> unavailable to easy_install without special knowledge of how to get
> them,   So old packages can have requirements in setup.py which become
> unavailable  for simple install.
>

If you give pip or easy_install (or I assume buildout) a requirement like
Foo==0.1, then they will look at http://pypi.python.org/simple/Foo/0.1, and
if the release is hidden that will still return the links for that version
of the package.  If you give a version like Foo<=0.1 then it won't work
(assuming 0.1 is hidden), as there's no deep link that either installer will
look at.

A weird case is that links in long_description in old releases will show up
regardless, so if you actually want to purge a link (e.g., to a non-existent
repository) then it requires editing all versions of the package.  This might
be unintentional.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From mcrute at gmail.com  Fri Jun 18 21:11:29 2010
From: mcrute at gmail.com (Michael Crute)
Date: Fri, 18 Jun 2010 15:11:29 -0400
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>
Message-ID: <AANLkTimsSO_YRJDuuyIMcIy7UfX6vIJx1rngiQiFPUi4@mail.gmail.com>

On Fri, Jun 18, 2010 at 12:44 PM, Ian Bicking <ianb at colorstudy.com> wrote:
> With all the reliability discussion, I thought I'd offer a kind of
> counterproposal, that we rewrite PyPI to use App Engine.
>
> Of course, this means writing code, etc., but I believe this is a reasonable
> goal.  I think if "we" (Catalog-SIG?  PyPI maintainers?) committed to using
> such an implementation (assuming it was of good quality) that we could find
> people (probably not on this list) to write and maintain the code.  People
> have already rewritten PyPI a couple times, but no one knows what exactly to
> *do* with the rewrite so they haven't gone anywhere.  And PyPI is not a
> particularly complicated application.  I think we can set the bar high on
> the implementation quality and that people will meet it, so long as they
> know their effort won't be in vain.
>
> Why App Engine?  The primary reason I'm proposing it is because it will be
> much easier to manage.  If it runs out of memory it won't bring down a
> machine.  If new people maintain the system it's easy to describe how to do
> deployments, for instance.  It's easy for people to install their own PyPI
> instances for development and to generate patches.  Hosted services can have
> downtimes of course, but unlike currently there are other people (the App
> Engine maintainers) who will resolve those problems.  There's still a class
> of bugs like badly indexed tables or weird locking issues that could bring
> PyPI down and "we" would have to fix it, and with a rewrite there's more of
> a risk of that, but... it'll just take some testing to make sure things are
> okay.
>
> In terms of cost, I expect we can get free hosting, and packages can be
> stored directly in the data store.  That doesn't preclude using a CDN like
> CloudFront, but that can be handled separately.  Also since the index just
> links to packages, packages can be incrementally uploaded to a CDN.
>
> Besides a commitment to using the code (which I think is really important to
> motivate people), a scrubbed dump of the database would be really helpful
> for development.  I know we've passed around complete dumps to people, but
> it contains private information so we can't put it up publicly which creates
> a speed bump for developers.

I would very much like to see PyPI start using chishop. I've been
working to implement the complete set of features that PyPI supports
(including the mirroring PEP) for use inside the company I work
for. The code is in reasonably good shape and I would love to see it
become the official implementation of PyPI. Though I haven't tested it,
I don't see any reason why it wouldn't run on App Engine with no
additional work.

-- 
Michael E. Crute
http://mike.crute.org

It is a mistake to think you can solve any major problem just with
potatoes. --Douglas Adams

From pje at telecommunity.com  Fri Jun 18 23:13:40 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Fri, 18 Jun 2010 17:13:40 -0400
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
Message-ID: <20100618211350.41F903A414B@sparrow.telecommunity.com>

At 12:01 PM 6/18/2010 -0500, Ian Bicking wrote:
>On Fri, Jun 18, 2010 at 11:47 AM, Mark Ramm <mark at geek.net> wrote:
>On Thu, Jun 17, 2010 at 5:41 PM, "Martin v. Löwis"
><martin at v.loewis.de> wrote:
> >> It does?  I thought PyPI kept everything around (but hidden) unless the
> >> author went in and manually deleted old stuff.  You just need to go to a
> >> deep link, e.g., http://pypi.python.org/pypi/SomePackage/0.1
> >
> > Sure, but owners *do* manually delete old stuff.
>Am I wrong in remembering that old packages get dropped from the
>simple index?
>I'm not saying they get deleted from the server, but they are made
>unavailable to easy_install without special knowledge of how to get
>them,   So old packages can have requirements in setup.py which become
>unavailable  for simple install.
>
>
>If you give pip or easy_install (or I assume buildout) a requirement
>like Foo==0.1, then they will look at
>http://pypi.python.org/simple/Foo/0.1,

easy_install doesn't do that, unless you explicitly add that URL via 
-f or --find-links.  Is that a feature you added in pip?


>and if the release is hidden that will still return the links for
>that version of the package.  If you give a version like Foo<=0.1
>then it won't work (assuming 0.1 is hidden), as there's no deep link
>that either installer will look at.
>
>A weird case is that links in long_description in old releases will
>show up regardless, so if you actually want to purge a link (e.g.,
>to a non-existent repository) then it require editing all versions
>of the package.  This might be unintentional.

It's at least consistent -- all URLs for all versions (whether hidden 
or not) show up when you access the packagewide page.

There was some discussion in the past about whether this was 
appropriate; IMO it's not, as it was an effective API change from the 
pre-/simple days.  Before, if a release was hidden, there was no way 
for easy_install to find it except via explicit -f usage.  Now, there 
is no way for an author to hide a release from automatic installation 
and still allow for manual installation.


From pje at telecommunity.com  Fri Jun 18 23:14:31 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Fri, 18 Jun 2010 17:14:31 -0400
Subject: [Catalog-sig] Extra links on the PyPI /simple index package
 pages
Message-ID: <20100618211441.6EEE93A414B@sparrow.telecommunity.com>

At 11:01 AM 6/18/2010 -0500, Ian Bicking wrote:
>A simple security hole would be having a homepage that is a wiki -- 
>anyone could edit the wiki and put up a link to a trojan package and 
>it could get found and installed.

Of course, that's also a security hole even if you're *not* using an 
automated installation.  


From pje at telecommunity.com  Fri Jun 18 23:14:45 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Fri, 18 Jun 2010 17:14:45 -0400
Subject: [Catalog-sig] Extra links on the PyPI /simple index package
 pages
Message-ID: <20100618211455.2BDE73A414B@sparrow.telecommunity.com>

At 11:10 AM 6/18/2010 +0200, M.-A. Lemburg wrote:
>"Martin v. Löwis" wrote:
> > On 17.06.2010 15:16, M.-A. Lemburg wrote:
> >> Benji York wrote:
> >>> On Thu, Jun 17, 2010 at 7:40 AM, M.-A. Lemburg<mal at egenix.com>  wrote:
> >>>> http://pypi.python.org/simple/zc.buildout/
> >>>>
> >>>> BTW: what are all those bug links doing on the zc.buildout index page ?
> >>>
> >>> PyPI scrapes all the links from the long description; for many projects
> >>> that includes a change log with links to fixed bugs.
> >>
> >> Isn't that dangerous ?
> >>
> >> AFAIK, setuptools would start opening all those URLs and might
> >> find download files which are not necessarily under full control of
> >> the author, e.g. anyone could add a comment to a bug report or
> >> wiki page with a link to an egg file on some rogue server.
> >
> > I think you misunderstand. Links originate *only* from the long
> > description. The package owner has full control over that.
>
>I was referring to the linked assets that the package owner
>may not have full control over, e.g. in the above case,
>you have links pointing to launchpad and one to "file://".
>
>Such links (except the file:// one) can be useful in the
>package description, e.g. to point to a bug tracking
>system, documentation or other resources, but they are
>not really needed to point setuptools to download locations.

This is a misunderstanding of what setuptools does.  Setuptools only 
retrieves URLs that are *specifically designated* as a "home page" or 
"download" link (using the "rel" field of the A tag on the PyPI 
/simple page), or which are a recognizable download URL supplied by 
way of the long_description.

So, the risk you are describing does not actually exist.


> > If you think the package owner is opening up a security threat by
> > including the links in the first place - yes, that's indeed a risk.
>
>Is this feature still needed for setuptools ?

Yes.


>We have download URLs and homepage URLs which should be enough for
>setuptools to search and find the links to package download files.

No.  This would only be the case if the project's author had some 
other form of hosting.  For example, if you had a subversion 
repository for your development trunk, but didn't have any place to 
host an HTML page to link to it, the long_description would be the 
only way (AFAIK at present) for you to securely provide a link to 
that repository for setuptools (or humans) to use.

See also:

   http://peak.telecommunity.com/DevCenter/setuptools#making-your-package-available-for-easyinstall

and:

   http://peak.telecommunity.com/DevCenter/PackageIndexAPI

for more information on how the link parsing and retrieval works.

It is a common misconception that setuptools spiders pages for links; 
the truth is, it only reads the "home" and "download" URLs provided 
via the PyPI metadata, and those only if they're not obviously links 
to a package tarball (or zip, egg, etc.).  All other links must 
visibly point to something downloadable, or else they're ignored.

That means unless your bug tracking system's URL ends with 
"/myproject-1.2.tgz", it ain't gonna get downloaded.  And unless you 
used it as your "home page" link, it won't be searched for links, either.  ;-)
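
The rules described above can be sketched roughly as follows. This is an
illustration, not setuptools' implementation: the function names, the
suffix list, and the exact rel values ("homepage", "download") are
assumptions made for the example.

```python
from urllib.parse import urlparse

# Illustrative list only; the real set of recognized archive types may differ.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2", ".zip", ".egg")


def looks_downloadable(url):
    """True if the URL itself visibly names a package file or carries #egg=."""
    parts = urlparse(url)
    return parts.path.lower().endswith(ARCHIVE_SUFFIXES) or "egg=" in parts.fragment


def classify(url, rel=None):
    """Return 'download', 'scan', or 'ignore' for one link on a /simple page.

    rel is the value of the link's rel attribute, if it has one.
    """
    if looks_downloadable(url):
        return "download"   # obvious package link: use it directly
    if rel in ("homepage", "download"):
        return "scan"       # designated page: fetch it and look for links
    return "ignore"         # e.g. a bug tracker link from long_description


print(classify("https://bugs.launchpad.net/zc.buildout/+bug/12345"))
print(classify("http://example.com/myproject-1.2.tgz"))
print(classify("http://example.com/wiki/MyProject", rel="homepage"))
```

Under these assumptions a changelog full of bug links is harmless, while a
wiki used as the home page is still fetched and scanned.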


From ziade.tarek at gmail.com  Fri Jun 18 23:39:32 2010
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Fri, 18 Jun 2010 23:39:32 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <AANLkTiloZrUuVoz0ZVPVi8Ih3uxnAgFqtih_mkcZZjaJ@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
	<4C17A419.4060602@egenix.com>
	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>
	<A447F7AF-18D4-4E07-8F9B-BDA0E6BC92D2@mac.com>
	<AANLkTilm3RLO5Z5uJzCtjYUqmACLOl-jWucgQv13IicH@mail.gmail.com>
	<4C17BBC3.3050205@egenix.com>
	<AANLkTikXWnAlk_MVppqgCX6WhouDvqC0w8fPUJRpR6HK@mail.gmail.com>
	<loom.20100616T135105-60@post.gmane.org>
	<4C1919F1.9080506@v.loewis.de>
	<AANLkTiloZrUuVoz0ZVPVi8Ih3uxnAgFqtih_mkcZZjaJ@mail.gmail.com>
Message-ID: <AANLkTinpEvFwA42qQA31zbYKdpOGDjmPCDpxbS6XMO-c@mail.gmail.com>

On Thu, Jun 17, 2010 at 6:30 AM, Ian Bicking <ianb at colorstudy.com> wrote:
> On Wed, Jun 16, 2010 at 1:37 PM, "Martin v. Löwis" <martin at v.loewis.de>
> wrote:
>>>
>>> It is likely that some people will setup a mirror and then "forget" to
>>> take care
>>> about it. Like our buildbots really.
>>
>>
>> The same can happen to any infrastructure, though. Amazon may decide to
>> change the setup, and then the automated update procedure would break.
>> Of course, they would give advance notice - but then somebody would
>> have to react to that advance notice.
>
> That's not very likely, and if something does change it will be extremely
> well announced and documented.  Amazon is providing a commercial service
> lots of people rely on, their process is formalized and professionalized.
> And if Amazon makes mistakes they'll figure out how to avoid them next time,
> while mirror providers are a rotating crew that is unlikely to easily or
> reliably learn from past mistakes.

If a mirror manager doesn't do a good job, he'll just be taken out of
the ring after a while.
If we depend 100% on Amazon and there's a problem, the mirroring
will be down for the time being and we won't be able to do anything
about it.

> If we actually understood each time PyPI
> broke and fixed it none of this would be a problem; I'm not blaming anyone
> for that, but it's also not going to change and adding lots of mirror
> systems just adds more systems with exactly the same management problems
> that our current system has.

Yes, but the difference is that you don't put all your eggs in one basket:
it's very unlikely that ALL community mirrors will be down at the same
time, so a fall-back mechanism on the client side will raise the
availability automatically.
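
A minimal sketch of such a client-side fall-back, assuming a plain ordered
list of mirrors. The function names and the example.org hosts are
hypothetical, and the fetch callable is injected so the example runs
without network access; a real client would pass something built on
urllib.request.urlopen, whose URLError is a subclass of OSError.

```python
def fetch_with_fallback(path, mirrors, fetch):
    """Try each mirror in order; return (mirror, payload) from the first that answers."""
    errors = []
    for mirror in mirrors:
        try:
            return mirror, fetch(mirror + path)
        except OSError as exc:
            # Remember the failure and move on to the next mirror.
            errors.append((mirror, exc))
    raise OSError(f"all {len(mirrors)} mirrors failed for {path}: {errors}")


def fake_fetch(url):
    """Stand-in for a real HTTP fetch: mirror 'a' is down, the others answer."""
    if url.startswith("http://a.example.org"):
        raise OSError("connection refused")
    return "index page from " + url


mirrors = ["http://a.example.org", "http://b.example.org"]
print(fetch_with_fallback("/simple/Foo/", mirrors, fake_fetch))
```

The client only fails when every mirror in the list fails, which is the
availability argument made above.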

About Amazon: what will happen to their offer in 5 years? Will our
Cloud-PyPI infrastructure still work? What will the workload be to
maintain it? You can't be 100% sure the Python community will be able
to dedicate that time. PyPI works today because it is not forced by a
third party to evolve; it can evolve at its own pace.

By contrast, once the mirror system is set up, it will be dead easy
to add/remove a mirror in the ring, and no single node will act as a SPOF.

IMHO it's a bad idea to make this piece of our infrastructure depend
on one third-party commercial entity when we can provide a community
answer.

Now, a mirror could use Amazon, that would make more sense to me.

Regards
Tarek





-- 
Tarek Ziadé | http://ziade.org

From exarkun at twistedmatrix.com  Fri Jun 18 23:47:00 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Fri, 18 Jun 2010 21:47:00 -0000
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud
	for	better availability
In-Reply-To: <AANLkTinpEvFwA42qQA31zbYKdpOGDjmPCDpxbS6XMO-c@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
	<4C17A419.4060602@egenix.com>
	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>
	<A447F7AF-18D4-4E07-8F9B-BDA0E6BC92D2@mac.com>
	<AANLkTilm3RLO5Z5uJzCtjYUqmACLOl-jWucgQv13IicH@mail.gmail.com>
	<4C17BBC3.3050205@egenix.com>
	<AANLkTikXWnAlk_MVppqgCX6WhouDvqC0w8fPUJRpR6HK@mail.gmail.com>
	<loom.20100616T135105-60@post.gmane.org>
	<4C1919F1.9080506@v.loewis.de>
	<AANLkTiloZrUuVoz0ZVPVi8Ih3uxnAgFqtih_mkcZZjaJ@mail.gmail.com>
	<AANLkTinpEvFwA42qQA31zbYKdpOGDjmPCDpxbS6XMO-c@mail.gmail.com>
Message-ID: <20100618214700.2412.1860572271.divmod.xquotient.104@localhost.localdomain>

On 09:39 pm, ziade.tarek at gmail.com wrote:
>On Thu, Jun 17, 2010 at 6:30 AM, Ian Bicking <ianb at colorstudy.com> 
>wrote:
>>On Wed, Jun 16, 2010 at 1:37 PM, "Martin v. Löwis" 
>><martin at v.loewis.de>
>>wrote:
>>>>
>>>>It is likely that some people will setup a mirror and then "forget" 
>>>>to
>>>>take care
>>>>about it. Like our buildbots really.
>>>
>>>
>>>The same can happen to any infrastructure, though. Amazon may decide 
>>>to
>>>change the setup, and then the automated update procedure would 
>>>break.
>>>Of course, they would give advance notice - but then somebody would
>>>have to react to that advance notice.
>>
>>That's not very likely, and if something does change it will be 
>>extremely
>>well announced and documented.  Amazon is providing a commercial 
>>service
>>lots of people rely on, their process is formalized and 
>>professionalized.
>>And if Amazon makes mistakes they'll figure out how to avoid them next 
>>time,
>>while mirror providers are a rotating crew that is unlikely to easily 
>>or
>>reliably learn from past mistakes.
>
>if a mirror manager don't do a good job, he'll just be taken out of
>the ring after a while.
>If we depend 100% on Amazon, and if there's a problem, the mirroring
>will be down for the time being and we won't be able to do nothing
>about it.
>>If we actually understood each time PyPI
>>broke and fixed it none of this would be a problem; I'm not blaming 
>>anyone
>>for that, but it's also not going to change and adding lots of mirror
>>systems just adds more systems with exactly the same management 
>>problems
>>that our current system has.
>
>Yes but the difference is that you don't put all your eggs in the same 
>basket:
>it's very unlikely that ALL community mirrors will be down at the same
>time, thus
>a fall-back mechanism on the client side will raise the availability
>automatically.
>
>About Amazon: what will happen in 5 years with their offer ? will our
>Cloud-PyPI infrastructure will still work ?  what will be the workload
>to maintain it ? You can't
>be 100% sure the Python community will be able to dedicate that time.
>PyPI works today because it is not forced by a third party to evolve,
>it can evolve as its own pace.
>
>On the contrary, once the mirrors system is set, it will be dead easy
>to add/remove a mirror in the ring, and each node won't act as a SPOF
>
>IMHO it's a bad idea to make this piece of our infrastructure depend
>on one third party commercial entity, where we can provide a community
>answer.

There are (multiple!) open source implementations of the Amazon API.  If 
Amazon decides to discontinue their cloud services (something I doubt 
should really be one of the top ten concerns here), then anyone else can 
set up their own cloud with the same interface.

If I were going to run a PyPI mirroring service, I'd probably want to do 
it this way *anyway* because managing virtual machines is far easier 
than managing actual hardware.

So there are probably many other much more significant issues to be 
worrying about.

Jean-Paul

From ianb at colorstudy.com  Sat Jun 19 00:05:29 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 18 Jun 2010 17:05:29 -0500
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
	source code distribution on PyPI
In-Reply-To: <20100618211350.41F903A414B@sparrow.telecommunity.com>
References: <20100618211350.41F903A414B@sparrow.telecommunity.com>
Message-ID: <AANLkTinjgHJNKgDeuCw-t48NvMfcgm0M6Rqws3Wrgoe8@mail.gmail.com>

On Fri, Jun 18, 2010 at 4:13 PM, P.J. Eby <pje at telecommunity.com> wrote:

>> If you give pip or easy_install (or I assume buildout) a requirement like
>> Foo==0.1, then they will look at http://pypi.python.org/simple/Foo/0.1,
>>
>
>
> easy_install doesn't do that, unless you explicitly add that URL via -f or
> --find-links.  Is that a feature you added in pip?
>

Hmm... somehow I imagined I was copying easy_install functionality when I
added that, but I guess not.  But yes, pip does look at a version-specific
link as a special case.
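
As a rough illustration of the special case described above, here is a
minimal sketch of how a client could derive the index pages to scan from
a requirement string. This is not pip's actual code: the function name
candidate_pages, the simplified requirement pattern, and the ordering of
the returned pages are all assumptions made for the example. Only the
index root and the Foo/0.1 URL shape come from the thread.

```python
import re

# Index root as quoted in this thread.
INDEX_URL = "http://pypi.python.org/simple/"

# Hypothetical, simplified pattern: a project name, optionally pinned with "==".
_REQUIREMENT = re.compile(
    r"^\s*([A-Za-z0-9._-]+)\s*(?:==\s*([A-Za-z0-9._-]+))?\s*$"
)


def candidate_pages(requirement, index_url=INDEX_URL):
    """Return the index pages to scan for a requirement such as 'Foo==0.1'."""
    match = _REQUIREMENT.match(requirement)
    if match is None:
        raise ValueError("unsupported requirement: %r" % (requirement,))
    name, version = match.groups()
    pages = [index_url + name + "/"]
    if version is not None:
        # Special case: an exact pin also gets a version-specific page.
        pages.append(index_url + name + "/" + version)
    return pages
```

With this sketch, candidate_pages("Foo==0.1") yields the project page
followed by the version-specific page, while candidate_pages("Foo")
yields the project page only.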

-- 
Ian Bicking  |  http://blog.ianbicking.org

From martin at v.loewis.de  Sat Jun 19 00:33:19 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 19 Jun 2010 00:33:19 +0200
Subject: [Catalog-sig] PyPI template improvements
In-Reply-To: <A4976DA1-EA5C-4812-B2B8-2C45D6F9082D@ikanobori.jp>
References: <AAD06C81-9B08-4C51-87CD-4C681D91A09F@ikanobori.jp>	<4C194755.2060704@v.loewis.de>
	<hvd3jl$muv$1@dough.gmane.org> <4C1A8928.8090709@v.loewis.de>
	<A4976DA1-EA5C-4812-B2B8-2C45D6F9082D@ikanobori.jp>
Message-ID: <4C1BF42F.9050300@v.loewis.de>


> a question from me. Does setuptools browse the main pypi pages or does
> it use the simple version?

Both. Old versions (which still need to be supported) go to the main 
pages; new versions to the simple index. IOW, you need to maintain all 
links on the main pages that also exist on the simple pages.

> Another question is, if there is a need for JavaScript on the page
> (don't worry about making it inaccessible, I'll make everything degrade
> nicely) am I allowed to include a JavaScript framework? Right now I'm
> looking at jQuery (http://jquery.com/), or would there be something
> against this?

Not sure how this can be deployed, but if you come up with a solution, 
that's fine with me.

Regards,
Martin

From martin at v.loewis.de  Sat Jun 19 00:34:00 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 19 Jun 2010 00:34:00 +0200
Subject: [Catalog-sig] PyPI template improvements
In-Reply-To: <4C1B2F66.1050502@egenix.com>
References: <AAD06C81-9B08-4C51-87CD-4C681D91A09F@ikanobori.jp>	<4C194755.2060704@v.loewis.de>	<hvd3jl$muv$1@dough.gmane.org>
	<4C1A8928.8090709@v.loewis.de>
	<A4976DA1-EA5C-4812-B2B8-2C45D6F9082D@ikanobori.jp>
	<4C1B2F66.1050502@egenix.com>
Message-ID: <4C1BF458.8030503@v.loewis.de>

> setuptools used to parse the main web pages of PyPI. This was
> then changed and the /simple index invented. All recent versions
> of setuptools default to using the /simple index.

See my response, though. The old versions still need to be supported.

Regards,
Martin

From martin at v.loewis.de  Sat Jun 19 00:47:24 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 19 Jun 2010 00:47:24 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTimsSO_YRJDuuyIMcIy7UfX6vIJx1rngiQiFPUi4@mail.gmail.com>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>
	<AANLkTimsSO_YRJDuuyIMcIy7UfX6vIJx1rngiQiFPUi4@mail.gmail.com>
Message-ID: <4C1BF77C.10306@v.loewis.de>

> I would very much like to see pypi start using chishop. I've been
> working to implement the complete set of features that pypi supports
> (including the mirroring PEP) for use inside of the company I work
> for. The code is in reasonably good shape and I would love to see that
> become the official implementation of PyPi. Though I haven't tested it
> I don't see any reason that it wouldn't run on AppEngine with no
> additional work.

AFAICT, it is still way off being a replacement for PyPI. Where are the 
rendered web pages? Where is the account management? Where is file
upload, documentation upload? Browsing for classifiers? and so on.

This looks just like the simple index to me.

Regards,
Martin

From mcrute at gmail.com  Sat Jun 19 01:05:21 2010
From: mcrute at gmail.com (Michael Crute)
Date: Fri, 18 Jun 2010 19:05:21 -0400
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <4C1BF77C.10306@v.loewis.de>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com> 
	<AANLkTimsSO_YRJDuuyIMcIy7UfX6vIJx1rngiQiFPUi4@mail.gmail.com> 
	<4C1BF77C.10306@v.loewis.de>
Message-ID: <AANLkTil3cZX4pg3naYpoLEpdMJJWm5Kro36_uFac62s-@mail.gmail.com>

On Fri, Jun 18, 2010 at 6:47 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> I would very much like to see pypi start using chishop. I've been
>> working to implement the complete set of features that pypi supports
>> (including the mirroring PEP) for use inside of the company I work
>> for. The code is in reasonably good shape and I would love to see that
>> become the official implementation of PyPi. Though I haven't tested it
>> I don't see any reason that it wouldn't run on AppEngine with no
>> additional work.
>
> AFAICT, it is still way off being a replacement for PyPI. Where are the
> rendered web pages? Where is the account management? Where is file
> upload, documentation upload? Browsing for classifiers? and so on.
>
> This looks just like the simple index to me.

Yes, in its current state it's pretty basic. We are working on
rolling out an internal version of PyPI at work to assist with
distribution of our applications, so I'm working on full compatibility
with the official PyPI. We're still a little ways out but are moving
in the right direction.

I'm maintaining a todo list within my fork at
http://github.com/mcrute/chishop/blob/master/TODO and would very much
appreciate any input you might have as to which features are most
important for official compatibility and what is missing from that
list.

-- 
Michael E. Crute
http://mike.crute.org

It is a mistake to think you can solve any major problem just with
potatoes. --Douglas Adams

From martin at v.loewis.de  Sat Jun 19 01:07:51 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 19 Jun 2010 01:07:51 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <AANLkTinzXNTK7y200lvFdfqNmN-2Pfsu-BxrR1Y7ihT_@mail.gmail.com>
References: <4C19A308.5040806@zopyx.com>	<4C19C7A0.9080800@v.loewis.de>	<AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com>	<hvdd8p$tmb$1@dough.gmane.org>	<AANLkTike8MpGVOi2PjDJAhM2zFBrDiOI_W7V6yqI0Hbc@mail.gmail.com>	<4C1A607E.2030904@zopyx.com>	<AANLkTilqematflkFtlkxGtmPTHbTfjLo11rnnDKZqKi4@mail.gmail.com>	<4C1A6383.80105@zopyx.com>	<AANLkTikU2E6yPUGSlD06WwLm9W21Oxe-LwfflydI1npr@mail.gmail.com>	<AANLkTinwQRX6dOUq559nYpEX028zTj5-Bfg4I1giDvds@mail.gmail.com>	<4C1A96A6.3050101@v.loewis.de>
	<AANLkTinzXNTK7y200lvFdfqNmN-2Pfsu-BxrR1Y7ihT_@mail.gmail.com>
Message-ID: <4C1BFC47.8060301@v.loewis.de>

Am 18.06.2010 18:47, schrieb Mark Ramm:
> On Thu, Jun 17, 2010 at 5:41 PM, "Martin v. Löwis"<martin at v.loewis.de>  wrote:
>>> It does?  I thought PyPI kept everything around (but hidden) unless the
>>> author went in and manually deleted old stuff.  You just need to go to a
>>> deep link, e.g., http://pypi.python.org/pypi/SomePackage/0.1
>>
>> Sure, but owners *do* manually delete old stuff.
>
> Am I wrong in remembering that old packages get dropped from the
> simple index?

You are indeed misremembering. They used to be dropped, but no longer
are, at users' request.

Regards,
Martin


From ziade.tarek at gmail.com  Sat Jun 19 01:08:28 2010
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Sat, 19 Jun 2010 01:08:28 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <20100618214700.2412.1860572271.divmod.xquotient.104@localhost.localdomain>
References: <4C1768AF.9040606@egenix.com>
	<AANLkTinuDM7DMepkUoYRbVdgjo3R3ceLyvFY99-aMAYA@mail.gmail.com>
	<4C17A419.4060602@egenix.com>
	<AANLkTimP2j3GHoqWSvDed1wotHQI87iiNvD5BNkrWafF@mail.gmail.com>
	<A447F7AF-18D4-4E07-8F9B-BDA0E6BC92D2@mac.com>
	<AANLkTilm3RLO5Z5uJzCtjYUqmACLOl-jWucgQv13IicH@mail.gmail.com>
	<4C17BBC3.3050205@egenix.com>
	<AANLkTikXWnAlk_MVppqgCX6WhouDvqC0w8fPUJRpR6HK@mail.gmail.com>
	<loom.20100616T135105-60@post.gmane.org>
	<4C1919F1.9080506@v.loewis.de>
	<AANLkTiloZrUuVoz0ZVPVi8Ih3uxnAgFqtih_mkcZZjaJ@mail.gmail.com>
	<AANLkTinpEvFwA42qQA31zbYKdpOGDjmPCDpxbS6XMO-c@mail.gmail.com>
	<20100618214700.2412.1860572271.divmod.xquotient.104@localhost.localdomain>
Message-ID: <AANLkTinMDv1mtwdP7C4BovI0L8BxWSFPVRSOVNa2nVG-@mail.gmail.com>

On Fri, Jun 18, 2010 at 11:47 PM,  <exarkun at twistedmatrix.com> wrote:
[..]
>
> There are (multiple!) open source implementations of the Amazon API.  If
> Amazon decides to discontinue their cloud services (something I doubt should
> really be one of the top ten concerns here), then anyone else can set up
> their own cloud with the same interface.
>
> If I were going to run a PyPI mirroring service, I'd probably want to do it
> this way *anyway* because managing virtual machines is far easier than
> managing actual hardware.

I am not arguing in particular against Amazon, or any other service.
This is an implementation detail.

My point is that having a ring of mirrors (whatever technology each
one of these mirrors uses) is a better way to solve our availability
issues than setting up an infrastructure at Amazon ourselves (which we
would have to maintain).

Exactly because "anyone else can set up their own cloud (or whatever)
with the same interface".

In other words, the mirroring protocol is the interface that will give
us this availability, by switching to a server that is available when
another one is down, be it the main PyPI itself.

> So there are probably many other much more significant issues to be worrying
> about.

Not sure what you mean here. If it's in general, I completely agree. I
have a very long list :)

Regards

Tarek

-- 
Tarek Ziadé | http://ziade.org

From ziade.tarek at gmail.com  Sat Jun 19 01:27:39 2010
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Sat, 19 Jun 2010 01:27:39 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>
Message-ID: <AANLkTikvLLni5t9E_8nz7oR27rRyZYVngiSfWPFCoy8g@mail.gmail.com>

On Fri, Jun 18, 2010 at 6:44 PM, Ian Bicking <ianb at colorstudy.com> wrote:
> With all the reliability discussion, I thought I'd offer a kind of
> counterproposal, that we rewrite PyPI to use App Engine.
>
> Of course, this means writing code, etc., but I believe this is a reasonable
> goal.  I think if "we" (Catalog-SIG?  PyPI maintainers?) committed to using
> such an implementation (assuming it was of good quality) that we could find
> people (probably not on this list) to write and maintain the code.  People
> have already rewritten PyPI a couple times, but no one knows what exactly to
> *do* with the rewrite so they haven't gone anywhere.  And PyPI is not a
> particularly complicated application.  I think we can set the bar high on
> the implementation quality and that people will meet it, so long as they
> know their effort won't be in vain.

Out of curiosity: have you ever worked with the current implementation?

I have a hard time understanding why some people say it's hard to work
with; I don't think it's a valid argument.

>
> Why App Engine?  The primary reason I'm proposing it is because it will be
> much easier to manage.  If it runs out of memory it won't bring down a
> machine.  If new people maintain the system it's easy to describe how to do
> deployments, for instance.  It's easy for people to install their own PyPI
> instances for development and to generate patches.  Hosted services can have
> downtimes of course, but unlike currently there are other people (the App
> Engine maintainers) who will resolve those problems.  There's still a class
> of bugs like badly indexed tables or weird locking issues that could bring
> PyPI down and "we" would have to fix it, and with a rewrite there's more of
> a risk of that, but... it'll just take some testing to make sure things are
> okay.
>
> In terms of cost, I expect we can get free hosting, and packages can be
> stored directly in the data store.  That doesn't preclude using a CDN like
> CloudFront, but that can be handled separately.  Also since the index just
> links to packages, packages can be incrementally uploaded to a CDN.

Even if I don't think it's a priority among our concerns (community
mirrors come first), I wouldn't mind having the main PyPI UI in GAE.

Although, if PyPI were to be ported to GAE, couldn't we reuse the
existing code instead of rewriting from scratch? We would just have
to rewrite the DB layer.

> Besides a commitment to using the code (which I think is really important to
> motivate people), a scrubbed dump of the database would be really helpful
> for development.  I know we've passed around complete dumps to people, but
> it contains private information so we can't put it up publicly which creates
> a speed bump for developers.

Private information could easily be removed from those dumps.

But I don't think it's that helpful, since you have all the .sql scripts
to create your own DB. We could add a script that creates some sample
data on top of those scripts.

>
>
> Linkage...
> A buzz post where I asked about it:
> http://www.google.com/buzz/ianbicking/BRWDjsMCGWQ/I-like-the-original-proposal-move-PyPI-stuff-into
>
> A PyPI *mirror* written for App Engine:
> http://pypi.appspot.com/
>
> A PyPI implementation in Django (one is a fork of the other?),
> database-backed (would take some work to get it on App Engine):
> http://pypi.python.org/pypi/djangopypi/
> http://github.com/benliles/chishop
>
>
> --
> Ian Bicking  |  http://blog.ianbicking.org
>
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>
>



-- 
Tarek Ziadé | http://ziade.org

From martin at v.loewis.de  Sat Jun 19 01:51:38 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 19 Jun 2010 01:51:38 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTil3cZX4pg3naYpoLEpdMJJWm5Kro36_uFac62s-@mail.gmail.com>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>
	<AANLkTimsSO_YRJDuuyIMcIy7UfX6vIJx1rngiQiFPUi4@mail.gmail.com>
	<4C1BF77C.10306@v.loewis.de>
	<AANLkTil3cZX4pg3naYpoLEpdMJJWm5Kro36_uFac62s-@mail.gmail.com>
Message-ID: <4C1C068A.4000607@v.loewis.de>


> I'm maintaining a todo list within my fork at
> http://github.com/mcrute/chishop/blob/master/TODO and would very much
> appreciate any input you might have as to which features are most
> important for official compatibility and what is missing from that
> list.

The absolute requirement is that any URLs that PyPI provides must work 
exactly the same way. Primarily, this means
- package pages
- browse interface
- RSS

In addition, a number of features aren't listed yet:
- web registration of users
- web password reset
- OpenID support

Regards,
Martin

From pje at telecommunity.com  Sat Jun 19 01:57:45 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Fri, 18 Jun 2010 19:57:45 -0400
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <4C1BFC47.8060301@v.loewis.de>
References: <4C19A308.5040806@zopyx.com> <4C19C7A0.9080800@v.loewis.de>
	<AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com>
	<hvdd8p$tmb$1@dough.gmane.org>
	<AANLkTike8MpGVOi2PjDJAhM2zFBrDiOI_W7V6yqI0Hbc@mail.gmail.com>
	<4C1A607E.2030904@zopyx.com>
	<AANLkTilqematflkFtlkxGtmPTHbTfjLo11rnnDKZqKi4@mail.gmail.com>
	<4C1A6383.80105@zopyx.com>
	<AANLkTikU2E6yPUGSlD06WwLm9W21Oxe-LwfflydI1npr@mail.gmail.com>
	<AANLkTinwQRX6dOUq559nYpEX028zTj5-Bfg4I1giDvds@mail.gmail.com>
	<4C1A96A6.3050101@v.loewis.de>
	<AANLkTinzXNTK7y200lvFdfqNmN-2Pfsu-BxrR1Y7ihT_@mail.gmail.com>
	<4C1BFC47.8060301@v.loewis.de>
Message-ID: <20100618235804.0B9DA3A40A5@sparrow.telecommunity.com>

At 01:07 AM 6/19/2010 +0200, Martin v. Löwis wrote:
>Am 18.06.2010 18:47, schrieb Mark Ramm:
>>On Thu, Jun 17, 2010 at 5:41 PM, "Martin v. Löwis"
>><martin at v.loewis.de> wrote:
>>>>It does?  I thought PyPI kept everything around (but hidden) unless the
>>>>author went in and manually deleted old stuff.  You just need to go to a
>>>>deep link, e.g., http://pypi.python.org/pypi/SomePackage/0.1
>>>
>>>Sure, but owners *do* manually delete old stuff.
>>
>>Am I wrong in remembering that old packages get dropped from the
>>simple index?
>
>You are indeed misremembering. They used to, but don't, any longer, 
>on user request.

How many users?  I'm thinking it might be better to meet this use 
case the way pip does -- i.e., look up the specific version when a 
specific hidden version is requested, but otherwise only show active 
versions.  The current behavior makes it harder for package authors 
to control what versions are automatically installable by default.
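
To make the proposed behavior concrete, here is a small sketch of the
selection rule as I read it: hidden releases are skipped by default, but
an exact request for a specific version is honored even if that version
is hidden. The function name installable_versions and the dict-of-flags
data model are hypothetical and only serve the illustration; PyPI's real
schema is not shown in this thread.

```python
def installable_versions(releases, pinned=None):
    """Pick the versions a client should consider.

    releases maps a version string to True if the release is hidden.
    pinned is the exact version the user asked for, or None.
    """
    if pinned is not None:
        # An explicit request wins, whether or not the release is hidden.
        return [pinned] if pinned in releases else []
    # Otherwise only active (non-hidden) releases are offered.
    return [version for version, hidden in releases.items() if not hidden]
```

Under this rule, with 0.1 hidden and 0.2 active, an unpinned install
would only ever see 0.2, while a request for exactly 0.1 would still
resolve.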


From ianb at colorstudy.com  Sat Jun 19 01:58:00 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 18 Jun 2010 18:58:00 -0500
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTikvLLni5t9E_8nz7oR27rRyZYVngiSfWPFCoy8g@mail.gmail.com>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com> 
	<AANLkTikvLLni5t9E_8nz7oR27rRyZYVngiSfWPFCoy8g@mail.gmail.com>
Message-ID: <AANLkTimfRBpUr3uTBhoD9DbR5V46mWaSvzGV9xNKp2uX@mail.gmail.com>

On Fri, Jun 18, 2010 at 6:27 PM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:

> On Fri, Jun 18, 2010 at 6:44 PM, Ian Bicking <ianb at colorstudy.com> wrote:
> > With all the reliability discussion, I thought I'd offer a kind of
> > counterproposal, that we rewrite PyPI to use App Engine.
> >
> > Of course, this means writing code, etc., but I believe this is a
> > reasonable goal.  I think if "we" (Catalog-SIG?  PyPI maintainers?)
> > committed to using such an implementation (assuming it was of good
> > quality) that we could find people (probably not on this list) to write
> > and maintain the code.  People have already rewritten PyPI a couple
> > times, but no one knows what exactly to *do* with the rewrite so they
> > haven't gone anywhere.  And PyPI is not a particularly complicated
> > application.  I think we can set the bar high on the implementation
> > quality and that people will meet it, so long as they know their effort
> > won't be in vain.
>
> Out of curiosity : have you ever worked with the current implementation ?
>
> I have hard time to understand why some people say it's hard to work with
> it, I don't think its a valid argument.
>

I haven't looked at it in years, but I've poked around it some.  I found it
difficult, yes.


> > Why App Engine?  The primary reason I'm proposing it is because it will
> > be much easier to manage.  If it runs out of memory it won't bring down
> > a machine.  If new people maintain the system it's easy to describe how
> > to do deployments, for instance.  It's easy for people to install their
> > own PyPI instances for development and to generate patches.  Hosted
> > services can have downtimes of course, but unlike currently there are
> > other people (the App Engine maintainers) who will resolve those
> > problems.  There's still a class of bugs like badly indexed tables or
> > weird locking issues that could bring PyPI down and "we" would have to
> > fix it, and with a rewrite there's more of a risk of that, but... it'll
> > just take some testing to make sure things are okay.
> >
> > In terms of cost, I expect we can get free hosting, and packages can be
> > stored directly in the data store.  That doesn't preclude using a CDN
> > like CloudFront, but that can be handled separately.  Also since the
> > index just links to packages, packages can be incrementally uploaded to
> > a CDN.
>
> Even if I don't think its a priority in our concerns (community
> mirrors come first), I wouldn't mind having the main PyPI UI in GAE.
>

The priorities that motivate me are:

1. Make installation more reliable with respect to PyPI
2. Decrease overall maintenance burden
3. Decrease code liability

Community mirrors only address 1, while App Engine addresses 2 and a
rewrite addresses 3.  And I think App Engine would be significantly more
reliable than PyPI with mirrors.  It has fewer moving parts, and it's
built on infrastructure that is highly automated.  Also, because it
requires less maintenance, it doesn't need active tending if someone
drops out of communication for a while or goes on vacation.

There's a significant number of failure conditions that a mirror network
doesn't protect you from.  Connection refused, connection timed out, and 500
errors are the only really obvious errors that will make a tool look to the
next mirror.  And because of potential synchronization problems, there are a
lot of new problems a mirror network could introduce.
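
To show what "look to the next mirror" could mean in practice, here is a
minimal client-side sketch. It is an assumption-laden illustration, not
the behavior of pip, easy_install, or any PEP: the names fetch_url,
is_obvious_failure, and fetch_with_fallback are made up for the example,
and the mirror list is supplied by the caller. Only the three failure
classes named above trigger a fallback; anything else (a 404, or a stale
but successful response from an out-of-sync mirror) is not caught by it.

```python
import socket
import urllib.error
import urllib.request


def fetch_url(url, timeout=10):
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def is_obvious_failure(exc):
    """True for connection refused, timeouts, and HTTP 5xx responses."""
    if isinstance(exc, urllib.error.HTTPError):
        # HTTPError subclasses URLError, so it has to be checked first.
        return exc.code >= 500
    return isinstance(
        exc, (urllib.error.URLError, ConnectionError, socket.timeout)
    )


def fetch_with_fallback(path, mirrors, fetch=fetch_url):
    """Try each mirror in order and return the first successful body."""
    last_error = None
    for base in mirrors:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return fetch(url)
        except Exception as exc:
            if not is_obvious_failure(exc):
                raise  # e.g. a 404: the next mirror is not consulted
            last_error = exc
    raise RuntimeError("all mirrors failed") from last_error
```

Note that a mirror which answers successfully with outdated content is
indistinguishable from a healthy one here, which is the synchronization
concern raised above.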

> Although, if PyPI was to be ported to GAE, couldn't we reuse the
> existing code instead of rewriting from scratch ? we would just have
> to rewrite the DB layer.
>

I don't think it's worth reusing that code.

> > Besides a commitment to using the code (which I think is really
> > important to motivate people), a scrubbed dump of the database would be
> > really helpful for development.  I know we've passed around complete
> > dumps to people, but it contains private information so we can't put it
> > up publicly which creates a speed bump for developers.
>
> Private information could be easily removed from those dumps;
>
> But I don't think it's so helpful since you have all the .sql scripts to
> create your own DB. But we could add a script to create some sample data
> on the top of those scripts.
>

It's useful to have a representative data set to test with, especially for
stuff like performance testing.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From martin at v.loewis.de  Sat Jun 19 02:18:06 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 19 Jun 2010 02:18:06 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTimfRBpUr3uTBhoD9DbR5V46mWaSvzGV9xNKp2uX@mail.gmail.com>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>
	<AANLkTikvLLni5t9E_8nz7oR27rRyZYVngiSfWPFCoy8g@mail.gmail.com>
	<AANLkTimfRBpUr3uTBhoD9DbR5V46mWaSvzGV9xNKp2uX@mail.gmail.com>
Message-ID: <4C1C0CBE.3010305@v.loewis.de>

> It's useful to have a representative data set to test with, especially
> for stuff like performance testing.

Couldn't that be obtained through one of the many mirroring libraries?

If it's going to be a complete rewrite, anyway, I doubt that a dump 
according to the current db schema would help.

Regards,
Martin

From martin at v.loewis.de  Sat Jun 19 02:20:34 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 19 Jun 2010 02:20:34 +0200
Subject: [Catalog-sig] [Proposal] Registered packages must provide the
 source code distribution on PyPI
In-Reply-To: <20100618235804.0B9DA3A40A5@sparrow.telecommunity.com>
References: <4C19A308.5040806@zopyx.com> <4C19C7A0.9080800@v.loewis.de>
	<AANLkTikcyVM8BsJK7n9uDNqSrOXDCgjevCYycBw3xH6L@mail.gmail.com>
	<hvdd8p$tmb$1@dough.gmane.org>
	<AANLkTike8MpGVOi2PjDJAhM2zFBrDiOI_W7V6yqI0Hbc@mail.gmail.com>
	<4C1A607E.2030904@zopyx.com>
	<AANLkTilqematflkFtlkxGtmPTHbTfjLo11rnnDKZqKi4@mail.gmail.com>
	<4C1A6383.80105@zopyx.com>
	<AANLkTikU2E6yPUGSlD06WwLm9W21Oxe-LwfflydI1npr@mail.gmail.com>
	<AANLkTinwQRX6dOUq559nYpEX028zTj5-Bfg4I1giDvds@mail.gmail.com>
	<4C1A96A6.3050101@v.loewis.de>
	<AANLkTinzXNTK7y200lvFdfqNmN-2Pfsu-BxrR1Y7ihT_@mail.gmail.com>
	<4C1BFC47.8060301@v.loewis.de>
	<20100618235804.0B9DA3A40A5@sparrow.telecommunity.com>
Message-ID: <4C1C0D52.6020404@v.loewis.de>

Am 19.06.2010 01:57, schrieb P.J. Eby:
> At 01:07 AM 6/19/2010 +0200, Martin v. Löwis wrote:
>> Am 18.06.2010 18:47, schrieb Mark Ramm:
>>> On Thu, Jun 17, 2010 at 5:41 PM, "Martin v. Löwis"
>>> <martin at v.loewis.de> wrote:
>>>>> It does? I thought PyPI kept everything around (but hidden) unless the
>>>>> author went in and manually deleted old stuff. You just need to go
>>>>> to a
>>>>> deep link, e.g., http://pypi.python.org/pypi/SomePackage/0.1
>>>>
>>>> Sure, but owners *do* manually delete old stuff.
>>>
>>> Am I wrong in remembering that old packages get dropped from the
>>> simple index?
>>
>> You are indeed misremembering. They used to, but don't, any longer, on
>> user request.
>
> How many users?

It's been a long time; all I remember is that the users were massively 
(i.e. strongly, forcefully) demanding it, and there was no objection.
I guess you can find the discussion in the archives.

Regards,
Martin

From ziade.tarek at gmail.com  Sat Jun 19 02:55:29 2010
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Sat, 19 Jun 2010 02:55:29 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTimfRBpUr3uTBhoD9DbR5V46mWaSvzGV9xNKp2uX@mail.gmail.com>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>
	<AANLkTikvLLni5t9E_8nz7oR27rRyZYVngiSfWPFCoy8g@mail.gmail.com>
	<AANLkTimfRBpUr3uTBhoD9DbR5V46mWaSvzGV9xNKp2uX@mail.gmail.com>
Message-ID: <AANLkTilubJXQy_y0oZ_WQOCM8tmNcgThF4FYH_InluNR@mail.gmail.com>

On Sat, Jun 19, 2010 at 1:58 AM, Ian Bicking <ianb at colorstudy.com> wrote:
..
>> Out of curiosity : have you ever worked with the current implementation ?
>>
>> I have hard time to understand why some people say it's hard to work with
>> it,
>> I don't think its a valid argument.
>
> I haven't looked at it in years, but I've poked around it some.  I found it
> difficult, yes.

Having worked with both code bases, I find it much simpler than pip, but
it suffers from the same problems: some modules just grew too big, and
there are not enough tests ;)

PyPI has, for instance, a huge webui.py file, which should be split into
pieces.

..
>> Even if I don't think its a priority in our concerns (community
>> mirrors come first), I wouldn't mind having the main PyPI UI in GAE.
>
> The priorities that motivate me are:
>
> 1. Make installation more reliable with respect to PyPI
> 2. Decrease overall maintenance burden
> 3. Decrease code liability
>
> Community mirrors only address 1 while App Engine addresses 2 and a rewrite
> addresses 3.

I agree with 2, but I don't understand 3: why would App Engine decrease
code liability? Code can be a liability in any environment.

> There's a significant number of failure conditions that a mirror network
> doesn't protect you from.  Connection refused, connection timed out, and 500
> errors are the only really obvious errors that will make a tool look to the
> next mirror.  Because of potential synchronization problems there's a lot of
> new problems a mirror network could introduce.

A mirror network is not a silver bullet, but I don't think it has
significantly more failure conditions than any other solution. As a
matter of fact, potential synchronization problems should be addressed
by the mirroring protocol itself, whether you can think of a use case
now or we meet one later. But the main use case from the client's PoV,
"the server is down", is fixed by falling back to another server.

>> Although, if PyPI was to be ported to GAE, couldn't we reuse the
>> existing code instead of rewriting from scratch ? we would just have
>> to rewrite the DB layer.
>
> I don't think it's worth reusing that code.

Why is that? As a contributor to the project, I know it will take some
time to rewrite, even if the application is not big. Features are
still being added to it. So rewriting something from scratch strikes me
as a bad idea.

Here are the priorities I think we should set to solve the issues we
had with PyPI:

1. investigate more deeply why the PyPI server was down for some hours
2. make sure PyPI has more sysadmins in several timezones
3. make the existing mirrors "official" mirrors, via PEP 381

1 and 2 can be done right now.

Regards
Tarek

From mcrute at gmail.com  Sat Jun 19 03:23:54 2010
From: mcrute at gmail.com (Michael Crute)
Date: Fri, 18 Jun 2010 21:23:54 -0400
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTikvLLni5t9E_8nz7oR27rRyZYVngiSfWPFCoy8g@mail.gmail.com>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>
	<AANLkTikvLLni5t9E_8nz7oR27rRyZYVngiSfWPFCoy8g@mail.gmail.com>
Message-ID: <C4CBF5DC-24A1-4E8B-A8A4-E1F79DBAE03D@gmail.com>

On Jun 18, 2010, at 7:27 PM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> On Fri, Jun 18, 2010 at 6:44 PM, Ian Bicking <ianb at colorstudy.com> wrote:
>> Of course, this means writing code, etc., but I believe this is a reasonable
>> goal.  I think if "we" (Catalog-SIG?  PyPI maintainers?) committed to using
>> such an implementation (assuming it was of good quality) that we could find
>> people (probably not on this list) to write and maintain the code.  People
>> have already rewritten PyPI a couple times, but no one knows what exactly to
>> *do* with the rewrite so they haven't gone anywhere.  And PyPI is not a
>> particularly complicated application.  I think we can set the bar high on
>> the implementation quality and that people will meet it, so long as they
>> know their effort won't be in vain.
> 
> Out of curiosity : have you ever worked with the current implementation ?
> 
> I have hard time to understand why some people say it's hard to work with it,
> I don't think its a valid argument.

I briefly played with the current implementation and found it somewhat
difficult to work with. Part of the problem is that the code is dated and
not well tested. The other part of the problem is that there are too many
dependencies, and replicating the environment required to run the official
code is somewhat painful. For my uses I really don't want to run postgres
just to serve a version of the cheeseshop. A project like chishop
eliminates many of these problems, as its main dependency is Django, which
is designed to make setting up the application simple and allows you to
choose what kind of database you want, from something very simple like
sqlite all the way up to something more robust like postgres.

From mcrute at gmail.com  Sat Jun 19 03:26:44 2010
From: mcrute at gmail.com (Michael Crute)
Date: Fri, 18 Jun 2010 21:26:44 -0400
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <4C1C068A.4000607@v.loewis.de>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>
	<AANLkTimsSO_YRJDuuyIMcIy7UfX6vIJx1rngiQiFPUi4@mail.gmail.com>
	<4C1BF77C.10306@v.loewis.de>
	<AANLkTil3cZX4pg3naYpoLEpdMJJWm5Kro36_uFac62s-@mail.gmail.com>
	<4C1C068A.4000607@v.loewis.de>
Message-ID: <BE14CCED-D5BF-442C-8FB8-D927C4912A6E@gmail.com>

On Jun 18, 2010, at 7:51 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
>> I'm maintaining a todo list within my fork at
>> http://github.com/mcrute/chishop/blob/master/TODO and would very much
>> appreciate any input you might have as to which features are most
>> important for official compatibility and what is missing from that
>> list.
> 
> The absolute requirement is that any URLs that PyPI provides must work exactly the same way. Primarily, this means
> - package pages
> - browse interface
> - RSS
> 
> In addition, a number of features aren't listed yet:
> - web registration of users
> - web password reset
> - OpenID support

Thanks, I'll update my notes accordingly. Are there specs for any of these protocols? A few things are codified in PEPs but the rest seem to just require research into the code that currently implements the functionality. 

From ziade.tarek at gmail.com  Sat Jun 19 03:31:26 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Sat, 19 Jun 2010 03:31:26 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <C4CBF5DC-24A1-4E8B-A8A4-E1F79DBAE03D@gmail.com>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>
	<AANLkTikvLLni5t9E_8nz7oR27rRyZYVngiSfWPFCoy8g@mail.gmail.com>
	<C4CBF5DC-24A1-4E8B-A8A4-E1F79DBAE03D@gmail.com>
Message-ID: <AANLkTinJKPuQrSCFP8KgUYYBXS08-k98EcPOtbgruIuu@mail.gmail.com>

On Sat, Jun 19, 2010 at 3:23 AM, Michael Crute <mcrute at gmail.com> wrote:
> On Jun 18, 2010, at 7:27 PM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
>> On Fri, Jun 18, 2010 at 6:44 PM, Ian Bicking <ianb at colorstudy.com> wrote:
>>> Of course, this means writing code, etc., but I believe this is a reasonable
>>> goal.  I think if "we" (Catalog-SIG?  PyPI maintainers?) committed to using
>>> such an implementation (assuming it was of good quality) that we could find
>>> people (probably not on this list) to write and maintain the code.  People
>>> have already rewritten PyPI a couple times, but no one knows what exactly to
>>> *do* with the rewrite so they haven't gone anywhere.  And PyPI is not a
>>> particularly complicated application.  I think we can set the bar high on
>>> the implementation quality and that people will meet it, so long as they
>>> know their effort won't be in vain.
>>
>> Out of curiosity: have you ever worked with the current implementation?
>>
>> I have a hard time understanding why some people say it's hard to work with;
>> I don't think it's a valid argument.
>
> I briefly played with the current implementation and found it somewhat difficult to work with. Part of the problem is that the code is dated and not well tested. The other part of the problem is that there are too many dependencies and replicating the environment required to run the official code is somewhat painful. For my uses I really don't want to run postgres just to serve a version of the cheeseshop. A project like chishop eliminates many of these problems, as its main dependency is Django, which is designed to make setting up the application simple and allows you to choose what kind of database you want, from something very simple like sqlite all the way up to something more robust like postgres.

Right, switching to something like SQLAlchemy would be better



-- 
Tarek Ziadé | http://ziade.org

From martin at v.loewis.de  Sat Jun 19 10:10:30 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sat, 19 Jun 2010 10:10:30 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <C4CBF5DC-24A1-4E8B-A8A4-E1F79DBAE03D@gmail.com>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>	<AANLkTikvLLni5t9E_8nz7oR27rRyZYVngiSfWPFCoy8g@mail.gmail.com>
	<C4CBF5DC-24A1-4E8B-A8A4-E1F79DBAE03D@gmail.com>
Message-ID: <4C1C7B76.7020409@v.loewis.de>


> I briefly played with the current implementation and found it
> somewhat difficult to work with. Part of the problem is that the code
> is dated

Can you please explain what that means? What is "dated code", how do you 
recognize it, and why does it make it difficult to work with?

> The other part of the problem is that
> there are too many dependencies and replicating the environment
> required to run the official code is somewhat painful. For my uses I
> really don't want to run postgres just to serve a version of the
> cheeseshop.

Hmm. If setting up postgres is already considered a burden, I guess
I understand the problem. However, dependency-wise, I'd argue that
PyPI fares much better than many of the packages on PyPI. Its list
of dependencies is really short.

Regards,
Martin

From martin at v.loewis.de  Sat Jun 19 10:12:28 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sat, 19 Jun 2010 10:12:28 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <BE14CCED-D5BF-442C-8FB8-D927C4912A6E@gmail.com>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>
	<AANLkTimsSO_YRJDuuyIMcIy7UfX6vIJx1rngiQiFPUi4@mail.gmail.com>
	<4C1BF77C.10306@v.loewis.de>
	<AANLkTil3cZX4pg3naYpoLEpdMJJWm5Kro36_uFac62s-@mail.gmail.com>
	<4C1C068A.4000607@v.loewis.de>
	<BE14CCED-D5BF-442C-8FB8-D927C4912A6E@gmail.com>
Message-ID: <4C1C7BEC.4090506@v.loewis.de>

> Thanks, I'll update my notes accordingly. Are there specs for any of
> these protocols?

No. If you ask for a specific spec, I can write one in an email message, 
though.

> A few things are codified in PEPs but the rest seem
> to just require research into the code that currently implements the
> functionality.

No. I'd rather recommend using PyPI, and locating these features. It's
straight-forward to derive a spec for most of them in a blackbox fashion.

Regards,
Martin

From g.brandl at gmx.net  Sat Jun 19 12:01:35 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 19 Jun 2010 12:01:35 +0200
Subject: [Catalog-sig] Mercurial
In-Reply-To: <4C192BA7.8010202@netwok.org>
References: <4C121377.4000008@simplistix.co.uk>	<AANLkTikucdIUdwC4x2i5u7yL_ih0XglXwUxAXO_HaWVb@mail.gmail.com>	<4C127DD4.5010801@v.loewis.de>	<AANLkTinzkgBopF3clE-au6sbPvI9sQKLe9GHc4GGRGla@mail.gmail.com>	<4C12A2E4.2090305@v.loewis.de>	<4C12A54D.1070406@egenix.com>	<AANLkTikWNTRcLV7RsnMtpuo-OJFHPbeVrdogx4nxlPzS@mail.gmail.com>	<4C14D8E8.4010903@egenix.com>	<AANLkTimMhm4Yq9CuiWTxasMGMWKBfV6Yoo6GCTln6rRb@mail.gmail.com>	<4C15F5F3.40501@egenix.com>	<AANLkTineOzOP7Bb9JiAVPulkExOOyLjP8yHTLzmJMPkc@mail.gmail.com>	<4C176BD4.3080909@egenix.com>	<AANLkTilsh8ymbJ6JRzF3QxMe0Rv9K-x3_WKZ-wG4_uht@mail.gmail.com>	<4C17CE55.5000601@v.loewis.de>	<AANLkTilaoMa0OEkwcRzlL2ild5C4fGVf1hg-C5lsQmBu@mail.gmail.com>	<4C17F065.7070309@v.loewis.de>
	<4C192BA7.8010202@netwok.org>
Message-ID: <hvi4kl$v0m$1@dough.gmane.org>

On 16.06.2010 21:53, Éric Araujo wrote:
>> After using Mercurial in one project, I'm skeptical that this really 
>> makes things simpler. I find it very hard to find out what changes a 
>> specific clone has that I still need to integrate.
> 
> There are commands to compare repositories: incoming and outgoing (read
> "hg help incoming").
> 
>> Also, when merging with conflicts, I find it very difficult to determine
>> whether I merged all the conflicts correctly (since the diff will show
>> all changes, not just the conflicts).
> 
> I believe that's a known bug. David Wolever is writing an extension to
> show only the diff against the automated merge, which would be more
> helpful: http://mercurial.selenic.com/wiki/MergediffExtension
> Bitbucket uses a similar algo to display merge diffs, I think.

I can understand that the behavior of mergediff is useful sometimes, but
the "default" one is what I want most of the time, for example when pulling
changes from a fork of Sphinx.  I need to make sure that all "new code"
coming from the other branch integrates well with whatever may have changed
between the fork and the merge, and that includes locations without conflict.

Of course, one gets huge diffs soon, but there is always the possibility of
"splitting" merges, i.e. for a changeset graph (after pulling) looking
like this

      A1  A2  M
 /----o---o---O
-            /
 \--o--o--o-/
    B1 B2 B3

where A are your changes and B the other branch's, and M the merge commit,
you can merge a number of earlier versions so that it looks like this:

      A1  A2  M1   M2
 /----o---o---O----O
-            /    /
 \--o--o----+--o-/
    B1 B2      B3

If these merge points are chosen suitably (after logical units), this can
make merging much less painful.

Just a data point,
Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From marrakis at gmail.com  Sat Jun 19 12:19:50 2010
From: marrakis at gmail.com (Mathieu Leduc-Hamel)
Date: Sat, 19 Jun 2010 12:19:50 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <4C1C7BEC.4090506@v.loewis.de>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>
	<AANLkTimsSO_YRJDuuyIMcIy7UfX6vIJx1rngiQiFPUi4@mail.gmail.com>
	<4C1BF77C.10306@v.loewis.de>
	<AANLkTil3cZX4pg3naYpoLEpdMJJWm5Kro36_uFac62s-@mail.gmail.com>
	<4C1C068A.4000607@v.loewis.de>
	<BE14CCED-D5BF-442C-8FB8-D927C4912A6E@gmail.com>
	<4C1C7BEC.4090506@v.loewis.de>
Message-ID: <AANLkTik72UhhETUPelVRPEwYPEEG0y5iPqawulWH81KA@mail.gmail.com>

The list of dependencies of PyPI is not that big; I would like to add some
new ones in the future, such as SQLAlchemy, as Tarek said.

As for the different problems of the current code base, we have already
started to work on these.

- For unit testing, the test coverage is now around 40% and it grows each week.
- We are starting to split the two big modules, store.py and webui.py, into
multiple logical modules.
- After the tests are in place, we are planning to switch to SQLAlchemy,
which will make things easier for contributors and testers and will
simplify the code in the store.py module.

I don't see any justification to ditch the current code base, since it
works! We need to make sure everything is working properly, simplify it and
clean it, but switching to something else will take time and it will be the
eternal debate: which framework, which database, which server... we've
already started it.

Oh, and by the way, apart from the sqldump, PyPI is now pretty easy to
install: we put a buildout in it and you launch it using paster. That's all.

On Sat, Jun 19, 2010 at 10:12 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:

> Thanks, I'll update my notes accordingly. Are there specs for any of
>> these protocols?
>>
>
> No. If you ask for a specific spec, I can write one in an email message,
> though.
>
>
>  A few things are codified in PEPs but the rest seem
>> to just require research into the code that currently implements the
>> functionality.
>>
>
> No. I'd rather recommend using PyPI, and locating these features. It's
> straight-forward to derive a spec for most of them in a blackbox fashion.
>
> Regards,
> Martin

From solipsis at pitrou.net  Sat Jun 19 14:32:55 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 19 Jun 2010 12:32:55 +0000 (UTC)
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com>
Message-ID: <loom.20100619T143029-538@post.gmane.org>

Ian Bicking <ianb <at> colorstudy.com> writes:
> 
> With all the reliability discussion, I thought I'd offer a kind of
> counterproposal, that we rewrite PyPI to use App Engine.

How reasonable is it to base PyPI on a third-party proprietary platform,
infrastructure and API?
Shouldn't this kind of decision at least require something such as PSF approval?

Thanks

Antoine.



From martin at v.loewis.de  Sat Jun 19 17:58:54 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 19 Jun 2010 17:58:54 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability
In-Reply-To: <AANLkTikD4EiiOWcsRWWW6btxOQk2gO6EPs-Xa69fRPao@mail.gmail.com>
References: <4C1768AF.9040606@egenix.com>	<4C17B6CE.20209@jcea.es>	<4C17BC38.6090208@egenix.com>	<4C17C4B5.3000801@jcea.es>	<AANLkTik-1Z6vecE5b83fKNA-qo687EiQUQDq5CIM5Oo5@mail.gmail.com>	<4C17F6D4.2050504@jcea.es>	<AANLkTilH12k89h5SwCW8CKUue3JASTESmzzxB_l-IOAU@mail.gmail.com>	<4C1804ED.8030708@v.loewis.de>	<AANLkTimRYb_72Miaya4uu1cUQKdWroXb7gxPhxpBH3ur@mail.gmail.com>	<4C186AB6.2030407@v.loewis.de>
	<AANLkTikD4EiiOWcsRWWW6btxOQk2gO6EPs-Xa69fRPao@mail.gmail.com>
Message-ID: <4C1CE93E.8070402@v.loewis.de>

> A simple way to protect against just the issue you mentioned is to
> have the clients retrieve the key over HTTPS or distribute the key
> with the client.

Ok. I have now enabled https for PyPI (https://pypi.python.org/pypi)

> Okay.   We'd be happy to work with you to get an easy solution put in
> place.

Thanks for the offer. Notice that this project is primarily about 
mirroring; other issues (should they exist) preferably should be dealt 
with separately.

> TUF is fairly early stage (our first major deployment is on going),
> but might be worth consideration.   I think we could probably put
> together a quick demo so that you and others could see how it might
> work with one of the existing client updaters.

I don't think adding another dependency to the clients is really 
acceptable. Instead, it must all be self-contained.

Regards,
Martin

From ianb at colorstudy.com  Sat Jun 19 18:24:12 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Sat, 19 Jun 2010 11:24:12 -0500
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <loom.20100619T143029-538@post.gmane.org>
References: <AANLkTikBv7-cLchCk4Zo1CH7I69iO8GOon7eOndE6x48@mail.gmail.com> 
	<loom.20100619T143029-538@post.gmane.org>
Message-ID: <AANLkTil6Bkj6uj-Nc3uMUyOqYNPzQpUa1DbhuQlcnJBk@mail.gmail.com>

On Sat, Jun 19, 2010 at 7:32 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:

> Ian Bicking <ianb <at> colorstudy.com> writes:
> >
> > With all the reliability discussion, I thought I'd offer a kind of
> > counterproposal, that we rewrite PyPI to use App Engine.
>
> How reasonable is it to base PyPI on a third-party proprietary platform,
> infrastructure and API?
> Shouldn't this kind of decision at least require something such as PSF
> approval?
>

Yes, that seems reasonable, though this SIG would be the first step
regardless.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From justinc at cs.washington.edu  Sat Jun 19 20:24:00 2010
From: justinc at cs.washington.edu (Justin Cappos)
Date: Sat, 19 Jun 2010 11:24:00 -0700
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability
In-Reply-To: <4C1CE93E.8070402@v.loewis.de>
References: <4C1768AF.9040606@egenix.com> <4C17B6CE.20209@jcea.es>
	<4C17BC38.6090208@egenix.com> <4C17C4B5.3000801@jcea.es>
	<AANLkTik-1Z6vecE5b83fKNA-qo687EiQUQDq5CIM5Oo5@mail.gmail.com>
	<4C17F6D4.2050504@jcea.es>
	<AANLkTilH12k89h5SwCW8CKUue3JASTESmzzxB_l-IOAU@mail.gmail.com>
	<4C1804ED.8030708@v.loewis.de>
	<AANLkTimRYb_72Miaya4uu1cUQKdWroXb7gxPhxpBH3ur@mail.gmail.com>
	<4C186AB6.2030407@v.loewis.de>
	<AANLkTikD4EiiOWcsRWWW6btxOQk2gO6EPs-Xa69fRPao@mail.gmail.com>
	<4C1CE93E.8070402@v.loewis.de>
Message-ID: <AANLkTimxbQBVDJQcSOmOhqAPdcro9pQo4ManvWpJCnZt@mail.gmail.com>

On Sat, Jun 19, 2010 at 8:58 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> A simple way to protect against just the issue you mentioned is to
>> have the clients retrieve the key over HTTPS or distribute the key
>> with the client.
>
> Ok. I have now enabled https for PyPI (https://pypi.python.org/pypi)

Great.   Assuming cert checking is implemented properly for the
clients who retrieve your server's key, this will protect against many
simple attacks.
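
As a rough sketch of what "cert checking implemented properly" could look like on the client side, assuming a current Python 3 standard library rather than the tools in use at the time of this thread: the helper names `verifying_context` and `fetch_verified` are hypothetical, and the example only shows that the TLS context requires a valid certificate and a matching hostname before any data is read.

```python
import ssl
import urllib.request


def verifying_context():
    """Build a TLS context that requires a valid, hostname-matching certificate."""
    # create_default_context() loads the system CA store and enables
    # both certificate verification and hostname checking.
    return ssl.create_default_context()


def fetch_verified(url, timeout=10):
    """Fetch a URL over HTTPS, failing if the server certificate does not verify."""
    with urllib.request.urlopen(url, context=verifying_context(), timeout=timeout) as response:
        return response.read()
```

A client would call something like `fetch_verified("https://pypi.python.org/pypi")`; a certificate or hostname mismatch surfaces as an exception instead of silently succeeding.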

> I don't think adding another dependency to the clients is really acceptable.
> Instead, it must all be self-contained.

Okay, sounds good.   We'll look elsewhere!

Thanks,
Justin

From mal at egenix.com  Mon Jun 21 12:57:42 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 21 Jun 2010 12:57:42 +0200
Subject: [Catalog-sig] Extra links on the PyPI /simple index package
	pages
In-Reply-To: <20100618210449.8621B3A414B@sparrow.telecommunity.com>
References: <4C19A308.5040806@zopyx.com>
	<4C19D4CA.1090304@egenix.com>	<4C19D745.3050900@zopyx.com>
	<4C19DCA9.5010308@egenix.com>	<4C19DF6F.9050106@zopyx.com>
	<4C19E409.8060603@egenix.com>	<4C19E54F.6030203@zopyx.com>
	<4C19F011.6010501@egenix.com>	<hvctat$k9$1@dough.gmane.org>
	<4C19FD2A.3050801@egenix.com>	<AANLkTinkFyTEcJhzv0BpRJgAapc6gNfUpEtIhrZbm2W4@mail.gmail.com>	<4C1A0992.7070507@egenix.com>	<AANLkTikczowGSN5jgGA6l19opbzNpnyJppiVGDFVTxCV@mail.gmail.com>	<4C1A201F.6080609@egenix.com>
	<4C1A9487.5070108@v.loewis.de>	<4C1B3813.3010102@egenix.com>
	<20100618210449.8621B3A414B@sparrow.telecommunity.com>
Message-ID: <4C1F45A6.6070503@egenix.com>

P.J. Eby wrote:
> At 11:10 AM 6/18/2010 +0200, M.-A. Lemburg wrote:
>> "Martin v. Löwis" wrote:
>> > On 17.06.2010 15:16, M.-A. Lemburg wrote:
>> >> Benji York wrote:
>> >>> On Thu, Jun 17, 2010 at 7:40 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>> >>>> http://pypi.python.org/simple/zc.buildout/
>> >>>>
>> >>>> BTW: what are all those bug links doing on the zc.buildout index page?
>> >>>
>> >>> PyPI scrapes all the links from the long description; for many projects
>> >>> that includes a change log with links to fixed bugs.
>> >>
>> >> Isn't that dangerous ?
>> >>
>> >> AFAIK, setuptools would start opening all those URLs and might
>> >> find download files which are not necessarily under full control of
>> >> the author, e.g. anyone could add a comment to a bug report or
>> >> wiki page with a link to an egg file on some rogue server.
>> >
>> > I think you misunderstand. Links originate *only* from the long
>> > description. The package owner has full control over that.
>>
>> I was referring to the linked assets that the package owner
>> may not have full control over, e.g. in the above case,
>> you have links pointing to launchpad and one to "file://".
>>
>> Such links (except the file:// one) can be useful in the
>> package description, e.g. to point to a bug tracking
>> system, documentation or other resources, but they are
>> not really needed to point setuptools to download locations.
> 
> This is a misunderstanding of what setuptools does.  Setuptools only
> retrieves URLs that are *specifically designated* as a "home page" or
> "download" link (using the "rel" field of the A tag on the PyPI /simple
> page), or which are a recognizable download URL supplied by way of the
> long_description.
> 
> So, the risk you are describing does not actually exist.
> 
> 
>> > If you think the package owner is opening up a security threat by
>> > including the links in the first place - yes, that's indeed a risk.
>>
>> Is this feature still needed for setuptools ?
> 
> Yes.
> 
> 
>> We have download URLs and homepage URLs which should be enough for
>> setuptools to search and find the links to package download files.
> 
> No.  This would only be the case if the project's author had some other
> form of hosting.  For example, if you had a subversion repository for
> your development trunk, but didn't have any place to host an HTML page
> to link to it, the long_description would be the only way (AFAIK at
> present) for you to securely provide a link to that repository for
> setuptools (or humans) to use.

The author could set up the home page or download URL to point to
that repository (SVN makes the repos available as HTML pages as
well).

> See also:
> 
>  
> http://peak.telecommunity.com/DevCenter/setuptools#making-your-package-available-for-easyinstall
> 
> 
> and:
> 
>   http://peak.telecommunity.com/DevCenter/PackageIndexAPI
> 
> for more information on how the link parsing and retrieval works.
> 
> It is a common misconception that setuptools spiders pages for links;
> the truth is, it only reads the "home" and "download" URLs provided via
> the PyPI metadata, and those only if they're not obviously links to a
> package tarball (or zip, egg, etc.).  All other links must visibly point
> to something downloadable, or else they're ignored.

So in summary, the /simple index page doesn't need to include
any URLs from the long_description that do not have a rel
attribute set, or end with one of the fixed set of archive extensions
or with "#egg=...".
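
The rule summarized above can be sketched roughly as follows. This is not setuptools' actual code: the names `LinkCollector`, `is_relevant` and `relevant_links` are hypothetical, the `rel` values "homepage" and "download" are assumed to be the ones PyPI emits, and the list of archive extensions is an assumed, incomplete set.

```python
from html.parser import HTMLParser

# Assumed set of archive extensions; the real list may differ.
ARCHIVE_EXTENSIONS = (".tar.gz", ".tgz", ".tar.bz2", ".zip", ".egg")


class LinkCollector(HTMLParser):
    """Collect (href, rel) pairs from every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href:
            self.links.append((href, attributes.get("rel")))


def is_relevant(href, rel):
    """True if the link is designated by rel, or visibly points to a download."""
    if rel in ("homepage", "download"):
        return True
    if "#egg=" in href:
        return True
    # Ignore any fragment or query string when checking the extension.
    path = href.split("#", 1)[0].split("?", 1)[0]
    return path.lower().endswith(ARCHIVE_EXTENSIONS)


def relevant_links(html):
    """Return the hrefs on a /simple-style page that would not be ignored."""
    collector = LinkCollector()
    collector.feed(html)
    return [href for href, rel in collector.links if is_relevant(href, rel)]
```

Under these assumptions, a bug-tracker link scraped from a change log has no `rel` attribute and no archive extension, so it is dropped, while a tarball link or an "#egg=" link is kept.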

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 21 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                27 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From pje at telecommunity.com  Mon Jun 21 16:52:06 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 21 Jun 2010 10:52:06 -0400
Subject: [Catalog-sig] Extra links on the PyPI /simple index package
 pages
Message-ID: <20100621145209.76A903A404D@sparrow.telecommunity.com>

At 12:57 PM 6/21/2010 +0200, M.-A. Lemburg wrote:
>So in summary, the /simple index page doesn't need to include
>any URLs from the long_description that do not have a rel
>attribute set, or end with one of the fixed set of archive extensions
>or with "#egg=...".

Such links are ignored, yes.  (The 'rel' links are only generated by 
PyPI, btw, not from the long_description.)

OTOH, I'm not sure what benefit there is to adding code that would 
specifically filter things down to just those URLs, since adding code 
always adds the potential for bugs, and the presence of those links 
is currently harmless.

(Unless of course you're so bandwidth starved that an extra few 
hundred bytes of link text is a problem... in which case, you could 
likely save even *more* bytes by stripping off the '<a' tags and 
their contents, and just serve up a text file with a series of lines 
reading 'href="..."', since setuptools is actually only looking for 
href attributes, not the tags that contain them.  That would shave a 
significant chunk of bytes off every page, not just the ones with extra links!)
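
A minimal sketch of the point about href attributes, assuming a simplified pattern rather than the one setuptools really uses: the function name `extract_hrefs` is hypothetical, and the pattern only handles quoted attribute values.

```python
import re

# Simplified pattern (an assumption, not setuptools' own): matches
# href="..." or href='...' whether or not it sits inside an <a> tag.
HREF = re.compile(r"""href\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def extract_hrefs(text):
    """Return every quoted href value found in the text, in order."""
    return HREF.findall(text)
```

Because the pattern never looks at the surrounding tag, a bare text file of `href="..."` lines yields the same result as a full HTML page with the same links.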


From almir at almirkaric.com  Thu Jun 24 23:24:20 2010
From: almir at almirkaric.com (Almir Karic)
Date: Thu, 24 Jun 2010 14:24:20 -0700
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
Message-ID: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>

i would like to help out with the move.

is anyone actually opposed to moving to GAE (either moving the current
code base or re-write, whichever seems more appropriate)?

-- 
python/django hacker & sys admin
http://almirkaric.com & http://twitter.com/redduck666

From mal at egenix.com  Fri Jun 25 00:16:15 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 25 Jun 2010 00:16:15 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
Message-ID: <4C23D92F.5060105@egenix.com>

Almir Karic wrote:
> i would like to help out with the move.
> 
> is anyone actually opposed to moving to GAE (either moving the current
> code base or re-write, whichever seems more appropriate)?

I don't think people are opposed to having a PyPI clone on GAE,
but moving the existing installation to GAE is something we would
have to discuss separately.

I for one would not welcome such a change, since we then completely
lose control over service availability.

Someone would also have to do some math to calculate the monthly
costs for the PSF:

    http://code.google.com/appengine/docs/quotas.html
    http://code.google.com/appengine/docs/billing.html
    http://code.google.com/appengine/business/

Please do consider helping on the already proposed PyPI
enhancements and changes.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com


From noah at coderanger.net  Fri Jun 25 00:14:36 2010
From: noah at coderanger.net (Noah Kantrowitz)
Date: Thu, 24 Jun 2010 15:14:36 -0700
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
Message-ID: <233601cb13ea$a2407780$e6c16680$@net>

Moving the current codebase wouldn't be possible given the direct usage of
Postgres for the database. I think you will find strong resistance to
anything involving a rewrite given recent discussions.

--Noah

> -----Original Message-----
> From: catalog-sig-bounces+noah=coderanger.net at python.org
> [mailto:catalog-sig-bounces+noah=coderanger.net at python.org] On Behalf
> Of Almir Karic
> Sent: Thursday, June 24, 2010 2:24 PM
> To: catalog-sig at python.org
> Subject: [Catalog-sig] Rewrite PyPI for App Engine?
> 
> i would like to help out with the move.
> 
> is anyone actually opposed to moving to GAE (either moving the current
> code base or re-write, whichever seems more appropriate)?
> 
> --
> python/django hacker & sys admin
> http://almirkaric.com & http://twitter.com/redduck666


From ianb at colorstudy.com  Fri Jun 25 01:37:41 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 24 Jun 2010 18:37:41 -0500
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <233601cb13ea$a2407780$e6c16680$@net>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com> 
	<233601cb13ea$a2407780$e6c16680$@net>
Message-ID: <AANLkTinTZQObpKE1rZRKKmt_TqRGfuHGvDPqaw-blFil@mail.gmail.com>

On Thu, Jun 24, 2010 at 5:14 PM, Noah Kantrowitz <noah at coderanger.net> wrote:

> Moving the current codebase wouldn't be possible given the direct usage of
> Postgres for the database. I think you will find strong resistance to
> anything involving a rewrite given recent discussions.
>

My memory of the ORM used in PyPI is that it is relatively non-relational (I
think it's based on Roundup's, which maybe supported non-relational
backends?)

-- 
Ian Bicking  |  http://blog.ianbicking.org

From noah at coderanger.net  Fri Jun 25 01:40:28 2010
From: noah at coderanger.net (Noah Kantrowitz)
Date: Thu, 24 Jun 2010 16:40:28 -0700
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTinTZQObpKE1rZRKKmt_TqRGfuHGvDPqaw-blFil@mail.gmail.com>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
	<233601cb13ea$a2407780$e6c16680$@net>
	<AANLkTinTZQObpKE1rZRKKmt_TqRGfuHGvDPqaw-blFil@mail.gmail.com>
Message-ID: <233c01cb13f6$a12f8640$e38e92c0$@net>

PyPI uses an ORM? As far as I know it is just running SQL via psycopg2.

--Noah

From: ianbicking at gmail.com [mailto:ianbicking at gmail.com] On Behalf Of Ian Bicking
Sent: Thursday, June 24, 2010 4:38 PM
To: Noah Kantrowitz
Cc: Almir Karic; catalog-sig at python.org
Subject: Re: [Catalog-sig] Rewrite PyPI for App Engine?

 

On Thu, Jun 24, 2010 at 5:14 PM, Noah Kantrowitz <noah at coderanger.net> wrote:

Moving the current codebase wouldn't be possible given the direct usage of
Postgres for the database. I think you will find strong resistance to
anything involving a rewrite given recent discussions.


My memory of the ORM used in PyPI is that it is relatively non-relational (I think it's based on Roundup's, which maybe supported non-relational backends?)

-- 
Ian Bicking  |  http://blog.ianbicking.org


From ianb at colorstudy.com  Fri Jun 25 01:49:04 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 24 Jun 2010 18:49:04 -0500
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <4C23D92F.5060105@egenix.com>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com> 
	<4C23D92F.5060105@egenix.com>
Message-ID: <AANLkTikMBLMcFRgVYc-sIcvSQapR8H96xrCf-Xma1Aq3@mail.gmail.com>

On Thu, Jun 24, 2010 at 5:16 PM, M.-A. Lemburg <mal at egenix.com> wrote:

> Almir Karic wrote:
> > i would like to help out with the move.
> >
> > is anyone actually opposed to moving to GAE (either moving the current
> > code base or re-write, whichever seems more appropriate)?
>
> I don't think people are opposed to having a PyPI clone on GAE,
> but moving the existing installation to GAE is something we would
> have to discuss separately.
>
> I for one would not welcome such a change, since we then completely
> lose control over service availability.
>

I don't really understand what this means.  Services become unavailable
sometimes.  A computer breaks, a company shuts down, an agreement ends.  We
don't necessarily have "control" over these situations, but we can respond
to them.  If App Engine goes down and the App Engine team is all like
"whatever, we'll get around to fixing stuff sometime" then sure it's a
problem.  But it's not a plausible problem.  The plausible problem is that
App Engine goes down, as it has from time to time, and we have to wait for
them to figure out what's wrong and fix it.  *We* don't have to fix it, we
only have to *wait for someone else to do it*.  I don't see any reason why
*we* are any better at fixing issues than the App Engine team would be.
Also presumably when there is a failure we want for the failure to be
understood and avoided in the future.  The App Engine team does that.  And
they do that *for us*.

In some catastrophic case we could move the site to another server, use
TyphoonAE to move the code over (or simply require that there is a
sufficient abstraction layer to allow for a more normal environment) and
bring the site up.  We control the domain, we can ultimately control where
it is hosted.  This kind of failure seems like it would be far more likely
given our current situation than on App Engine, but moving to App Engine
would not somehow make this kind of move impossible.

> Someone would also have to do some math to calculate the monthly
> costs for the PSF:
>
>    http://code.google.com/appengine/docs/quotas.html
>    http://code.google.com/appengine/docs/billing.html
>    http://code.google.com/appengine/business/
>

It seems unlikely we'd have to pay for the service.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From solipsis at pitrou.net  Fri Jun 25 09:21:29 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 25 Jun 2010 07:21:29 +0000 (UTC)
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
Message-ID: <loom.20100625T091650-630@post.gmane.org>

Almir Karic <almir <at> almirkaric.com> writes:
> 
> i would like to help out with the move.
> 
> is anyone actually opposed to moving to GAE (either moving the current
> code base or re-write, whichever seems more appropriate)?

As I already said, I don't think it's reasonable to do it without first getting
the community's (and the PSF's) agreement that a vital Python infrastructure can
be managed under a proprietary API, platform and datastore.

I would myself be strongly opposed to such a move.

(and I don't even get the point, technically, of wanting to use GAE, which seems
to provide a crippled version of Python)

Regards

Antoine.



From noah at coderanger.net  Fri Jun 25 09:39:21 2010
From: noah at coderanger.net (Noah Kantrowitz)
Date: Fri, 25 Jun 2010 00:39:21 -0700
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <loom.20100625T091650-630@post.gmane.org>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
	<loom.20100625T091650-630@post.gmane.org>
Message-ID: <CEA37824-3DA6-4828-908E-39C85F8F6099@coderanger.net>


On Jun 25, 2010, at 12:21 AM, Antoine Pitrou wrote:

> Almir Karic <almir <at> almirkaric.com> writes:
>> 
>> i would like to help out with the move.
>> 
>> is anyone actually opposed to moving to GAE (either moving the current
>> code base or re-write, whichever seems more appropriate)?
> 
> As I already said, I don't think it's reasonable to do it without first getting
> the community's (and the PSF's) agreement that a vital Python infrastructure can
> be managed under a proprietary API, platform and datastore.
> 
> I would myself be strongly opposed to such a move.
> 
> (and I don't even get the point, technically, of wanting to use GAE, which seems
> to provide a crippled version of Python)

GAE provides a professionally managed, "infinitely" scalable (or at least a heck of a lot more scalable than any other single server is likely to be, still not a substitute for mirrors), battle tested platform. There are already implementations of the GAE APIs that can be run independently, so I don't think it is quite as proprietary as you might think (though you do lose most of the benefits without having their services available; you just end up with yet another not-so-amazing web framework). I'm not saying that I think GAE is 100% the best path forward, but it certainly has a lot going for it. Also, while Google is a real company and has its own business to attend to, they have almost always been an ally and partner to the Python community and would likely be willing to work with us more so than, say, Amazon Web Services (Rackspace is also a big Python proponent though, and has cloud offerings similar to AWS). Similar arguments can be made for things like using S3/CloudFront for content hosting: it isn't a replacement for mirroring, but it would allow the main server (or possibly a "primary mirror") to take advantage of these powerful services towards better uptime, responsiveness, management, etc. This isn't a slight against the current system, just pointing out that while we have a few volunteers taking care of the PyPI server, Google has dozens of people who keep GAE running smoothly as their full-time job.

--Noah

From solipsis at pitrou.net  Fri Jun 25 09:53:57 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 25 Jun 2010 07:53:57 +0000 (UTC)
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
	<loom.20100625T091650-630@post.gmane.org>
	<CEA37824-3DA6-4828-908E-39C85F8F6099@coderanger.net>
Message-ID: <loom.20100625T094453-63@post.gmane.org>

Noah Kantrowitz <noah <at> coderanger.net> writes:
> 
> GAE provides a professionally managed, "infinitely" scalable (or at least a
> heck of a lot more scalable
> than any other single server is likely to be, still not a substitute for
> mirrors), battle tested platform.

Infinite scalability is the new fashionable thing. But most websites can run on
a single server fine, and PyPI seems to be one of those.

As for "battle tested", the most popular frameworks are, as is SQLAlchemy, as is
Apache, as is PostgreSQL... I don't get what GAE buys in this area.

> There are already implementations of the GAE APIs that can be run
> independently, so I don't think it is
> quite as proprietary as you might think

Isn't it like chasing a moving target, though?
For an analogy, there are independent implementations of the Win32 APIs, but I'm
not sure anyone would trust Wine for running production services.

> Also, while Google is a real company and has its own business to attend to,
> they have almost always been an ally and partner to the Python community
> and would likely be willing to work with us more so than, say, Amazon Web
> Services (Rackspace is also a big Python proponent though, and has cloud
> offerings similar to AWS).

But the point in this discussion is not to try to pit the various service
providers one against another. It's to choose whether we want to rely on a
proprietary platform (modulo alternate implementations, see above), or on a
similarly battle-tested "standard" FLOSS-based stack.

And, assuming Google would like to provide servers and hosting, why wouldn't
they simply provide Linux servers on which to run Apache and anything else we
need to?

Regards

Antoine.



From noah at coderanger.net  Fri Jun 25 10:03:23 2010
From: noah at coderanger.net (Noah Kantrowitz)
Date: Fri, 25 Jun 2010 01:03:23 -0700
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <loom.20100625T094453-63@post.gmane.org>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
	<loom.20100625T091650-630@post.gmane.org>
	<CEA37824-3DA6-4828-908E-39C85F8F6099@coderanger.net>
	<loom.20100625T094453-63@post.gmane.org>
Message-ID: <D0684B0E-2117-45BB-A498-E61E385B52BE@coderanger.net>


On Jun 25, 2010, at 12:53 AM, Antoine Pitrou wrote:

> Noah Kantrowitz <noah <at> coderanger.net> writes:
>> 
>> GAE provides a professionally managed, "infinitely" scalable (or at least a
>> heck of a lot more scalable
>> than any other single server is likely to be, still not a substitute for
>> mirrors), battle tested platform.
> 
> Infinite scalability is the new fashionable thing. But most websites can run on
> a single server fine, and PyPI seems to be one of those.
> 
> As for "battle tested", the most popular frameworks are, as is SQLAlchemy, as is
> Apache, as is PostgreSQL... I don't get what GAE buys in this area.
> 
>> There are already implementations of the GAE APIs that can be run
>> independently, so I don't think it is
>> quite as proprietary as you might think
> 
> Isn't it like chasing a moving target, though?
> For an analogy, there are independent implementations of the Win32 APIs, but I'm
> not sure anyone would trust Wine for running production services.
> 
>> Also, while Google is a real company and has its own business to attend to,
>> they have almost always been an ally and partner to the Python community
>> and would likely be willing to work with us more so than, say, Amazon Web
>> Services (Rackspace is also a big Python proponent though, and has cloud
>> offerings similar to AWS).
> 
> But the point in this discussion is not to try to pit the various service
> providers one against another. It's to choose whether we want to rely on a
> proprietary platform (modulo alternate implementations, see above), or on a
> similarly battle-tested "standard" FLOSS-based stack.
> 
> And, assuming Google would like to provide servers and hosting, why wouldn't
> they simply provide Linux servers on which to run Apache and anything else we
> need to?

It's mostly a question of ongoing management. Apache+Linux+$SQLSERVER+etc can certainly handle our needs (which, let's face it, aren't really that complex), but we don't have a full-time management staff for our server. By leaning on Google (or Amazon, Rackspace, etc) we don't have to worry about the day-to-day details of running the site. How many of the recent PyPI downtimes have just required bouncing Apache? Wouldn't it have been nice if a site engineer got paged within 60 seconds and had it dealt with soon after, instead of having to wait for one of the PyPI volunteers to notice and get to a computer? It isn't a question of capability; it is just a question of where our man-hours are best spent: simple maintenance or actually improving the site?
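
For illustration only, the kind of automated "notice it within 60 seconds" check described above could be sketched roughly as follows. This is a minimal sketch, not anything that is actually deployed for PyPI: the function names, the alert callback and the idea of running it once a minute from a scheduler are all assumptions made for the example.

```python
import urllib.error
import urllib.request


def is_up(url, fetch=None, timeout=10):
    """Return True if `url` answers with an HTTP 2xx/3xx status.

    `fetch` is a hypothetical hook (url -> status code) so the check can be
    exercised without touching the network; by default urllib is used.
    """
    if fetch is None:
        def fetch(u):
            with urllib.request.urlopen(u, timeout=timeout) as resp:
                return resp.status
    try:
        status = fetch(url)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False
    return 200 <= status < 400


def check_and_alert(url, alert, fetch=None):
    """Run one check; call `alert(message)` if the site looks down."""
    up = is_up(url, fetch=fetch)
    if not up:
        alert("%s is not responding" % url)
    return up
```

Run once a minute, something like check_and_alert(site_url, page_someone) would at least take "wait for a volunteer to notice" out of the loop; who gets paged, and what they can do about it, is the part that still needs people.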

--Noah

From mal at egenix.com  Fri Jun 25 10:39:45 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 25 Jun 2010 10:39:45 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTikMBLMcFRgVYc-sIcvSQapR8H96xrCf-Xma1Aq3@mail.gmail.com>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
	<4C23D92F.5060105@egenix.com>
	<AANLkTikMBLMcFRgVYc-sIcvSQapR8H96xrCf-Xma1Aq3@mail.gmail.com>
Message-ID: <4C246B51.9010700@egenix.com>

Ian Bicking wrote:
> On Thu, Jun 24, 2010 at 5:16 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> 
>> Almir Karic wrote:
>>> i would like to help out with the move.
>>>
>>> is anyone actually opposed to moving to GAE (either moving the current
>>> code base or re-write, whichever seems more appropriate)?
>>
>> I don't think people are opposed to having a PyPI clone on GAE,
>> but moving the existing installation to GAE is something we would
>> have to discuss separately.
>>
>> I for one would not welcome such a change, since we then completely
>> lose control over service availability.
>>
> 
> I don't really understand what this means.  Services become unavailable
> sometimes.  A computer breaks, a company shuts down, an agreement ends.  We
> don't necessarily have "control" over these situations, but we can respond
> to them.  If App Engine goes down and the App Engine team is all like
> "whatever, we'll get around to fixing stuff sometime" then sure it's a
> problem.  But it's not a plausible problem.  The plausible problem is that
> App Engine goes down, as it has from time to time, and we have to wait for
> them to figure out what's wrong and fix it.  *We* don't have to fix it, we
> only have to *wait for someone else to do it*.  I don't see any reason why
> *we* are any better at fixing issues than the App Engine team would be.
> Also presumably when there is a failure we want for the failure to be
> understood and avoided in the future.  The App Engine team does that.  And
> they do that *for us*.

I hear you, but don't agree that putting the runtime into the
hands of the GAE would get us an overall better service :-)

The point is that with GAE you only have control over the code
that you post there. Everything else is under control of the GAE
team (and their automatic administration systems), i.e. whether
your data is available and whether there are
proper backups, whether the site is reachable or not, whether
the performance is available and meets your requirements, whether
the service is accessible, fast enough and has low latency, etc.

So if something breaks, you can only fix it, if the problem
is caused by a bug in the code. For all other situations, you
have to wait for the GAE team to go in and do whatever is needed.

I'm not saying that the GAE team would be doing a poor job,
but just sitting there waiting for them to fix it in any
of the typical problem situations (apart from a bug in the
code), is asking a bit much, IMHO.

We have to find a middle ground, where we can still apply the
necessary hand holding ourselves, if we like to, while leaving
most of the day-to-day tasks to automatic tools or other service
providers to deal with.

Since PyPI is becoming a central piece of Python community
infrastructure, we need to make sure that we can provide a very
good uptime of the service and fast access to the data,
esp. for the automatic download tools.

Fortunately, those tools only use static data, so focusing on
making that highly available will get us a much better service
uptime with little extra effort.

> In some catastrophic case we could move the site to another server, use
> TyphoonAE to move the code over (or simply require that there is a
> sufficient abstraction layer to allow for a more normal environment) and
> bring the site up.  We control the domain, we can ultimately control where
> it is hosted.  This kind of failure seems like it would be far more likely
> given our current situation than on App Engine, but moving to App Engine
> would not somehow make this kind of move impossible.

True, but do you really want to go through all that trouble
just because GAE is down or too slow to be usable again ?

If we were to go for a cloud service to deploy the PyPI runtime, I'd
much rather like to see a standard virtualized server approach
being used.

With that approach, moving (virtual) servers would take
at most 5 minutes, if needed at all - you can rather easily set up
virtual servers as a high-availability cluster and then have
them manage the failover all by themselves.

BTW: Here's a nice blog on the subject of downtimes:
http://www.transparentuptime.com/

>> Someone would also have to do some math to calculate the monthly
>> costs for the PSF:
>>
>>    http://code.google.com/appengine/docs/quotas.html
>>    http://code.google.com/appengine/docs/billing.html
>>    http://code.google.com/appengine/business/
>>
> 
> It seems unlikely we'd have to pay for the service.

Perhaps, but then someone will have to get that information as well.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 25 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                23 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From marrakis at gmail.com  Fri Jun 25 10:44:39 2010
From: marrakis at gmail.com (Mathieu Leduc-Hamel)
Date: Fri, 25 Jun 2010 10:44:39 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <D0684B0E-2117-45BB-A498-E61E385B52BE@coderanger.net>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
	<loom.20100625T091650-630@post.gmane.org>
	<CEA37824-3DA6-4828-908E-39C85F8F6099@coderanger.net>
	<loom.20100625T094453-63@post.gmane.org>
	<D0684B0E-2117-45BB-A498-E61E385B52BE@coderanger.net>
Message-ID: <AANLkTinyA_VWKBVv0YVcMo6YrDW5DskRvMrrOJs6QQXZ@mail.gmail.com>

>
> It's mostly a question of ongoing management. Apache+Linux+$SQLSERVER+etc
> can certainly handle our needs (which, let's face it, aren't really that
> complex), but we don't have a full-time management staff for our server. By
> leaning on Google (or Amazon, Rackspace, etc) we don't have to worry about
> the day-to-day details of running the site. How many of the recent PyPI
> downtimes have just required bouncing Apache? Wouldn't it have been nice if
> a site engineer got paged within 60 seconds and had it dealt with soon after
> instead of having to wait for one of the PyPI volunteers to notice and get
> to a computer? It isn't a question of capability, it is just where are our
> man-hours best spent: simple maintenance or actually improving the site?
>
>
True, GAE would give us a good cloud implementation, but:

- Right now the problems we face with PyPI are not necessarily related to the
server or the type of deployment. We have concentrated the discussion on the
type of server or platform, but we completely forgot to consider whether the
code is working!

- Maybe switching to something else will just make PyPI restart more
frequently.

It's the first law of optimization: we need to find where the problem comes
from!

From ziade.tarek at gmail.com  Fri Jun 25 11:21:56 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Fri, 25 Jun 2010 11:21:56 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
Message-ID: <AANLkTiniv_6jRmsIbJ__YeYHooElNcVGoACCPiv4wGKK@mail.gmail.com>

On Thu, Jun 24, 2010 at 11:24 PM, Almir Karic <almir at almirkaric.com> wrote:
> i would like to help out with the move.
>
> is anyone actually opposed to moving to GAE (either moving the current
> code base or re-write, whichever seems more appropriate)?

Could you summarize the motivations for such a move ?

ISTM that the problem is more about the management of PyPI, rather
than its code.

Here's my summary:

1 - PyPI was down less than 2 days in 365 days IIRC. PyPI lacks
sysadmins; we need more, in several timezones. A sysadmin just relaunches
the service, like MvL or Jannis did.

2 - Some people in the community are frustrated with the current
process of getting a feature into PyPI. I don't have a strong opinion on
this, but I think having the code in a DVCS would be better.
hg.python.org would open the codebase to all Python core committers,
and people would be able to request pulls.
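
For scale, the downtime figure in point 1 can be turned into an availability percentage. A small sketch, taking the recalled "less than 2 days in 365" at face value as an upper bound on downtime (the figure is from memory, not a measurement, and the function name is just for illustration):

```python
def availability(downtime_days, period_days=365):
    """Fraction of the period during which the service was up."""
    return 1.0 - downtime_days / period_days


# 2 days is an upper bound on downtime, so this is a lower bound on uptime.
print("%.2f%%" % (availability(2) * 100))  # prints 99.45%
```

So the service was up somewhat more than 99.45% of the time over that year, if the recollection is right.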

Sure, writing something in GSoC would be fun, but if we want to
address the real problems, they're not in the code, IMO.

Regards
Tarek

-- 
Tarek Ziadé | http://ziade.org

From jcea at jcea.es  Fri Jun 25 12:36:37 2010
From: jcea at jcea.es (Jesus Cea)
Date: Fri, 25 Jun 2010 12:36:37 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTiniv_6jRmsIbJ__YeYHooElNcVGoACCPiv4wGKK@mail.gmail.com>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
	<AANLkTiniv_6jRmsIbJ__YeYHooElNcVGoACCPiv4wGKK@mail.gmail.com>
Message-ID: <4C2486B5.8090904@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 25/06/10 11:21, Tarek Ziadé wrote:
> 1 - PyPI was down less than 2 days in 365 days IIRC. PyPI lacks of
> sysadmins, we need more in several timezone. A sysadmin just relaunch
> the service, like MvL or Jannis did.

We need to deploy the mirroring PEP; then the impact of downtime on the
central PyPI server would be far smaller. The only missing point in the PEP,
AFAIK, is the crypto stuff to prevent mirror misbehaviour.

> 2 - Some people in the community are frustrated with the current
> process of getting a feature in PyPI. I don't have a strong opinion in
> this but I think having the code in a DVCS would be better.
> hg.python.org would open the codebase to all python core comitters,
> and people would be able to request pulls.

+1.

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTCSGtZlgi5GaxT1NAQLlUQP7BFVYQwcmic3Zu93yg5S1TD8YSsGm7YmM
X8RbKTt/1rE9cc1h53goFjx8r75PsjqpF16f2jARQjipEi+2066wS2pqflERVMKO
XnPi5UOz9M5oOe0MfvZKytMx+aMowcjVXhhE8tka9WE3qVZ0feZNEcqAE60nwx3h
j8hI2gpBKfE=
=uTHa
-----END PGP SIGNATURE-----

From ianb at colorstudy.com  Fri Jun 25 18:49:16 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 25 Jun 2010 11:49:16 -0500
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <4C246B51.9010700@egenix.com>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com> 
	<4C23D92F.5060105@egenix.com>
	<AANLkTikMBLMcFRgVYc-sIcvSQapR8H96xrCf-Xma1Aq3@mail.gmail.com> 
	<4C246B51.9010700@egenix.com>
Message-ID: <AANLkTimXA3S891EZy_MeKtoasmRzy3OqJ7EwbSRfAfvt@mail.gmail.com>

On Fri, Jun 25, 2010 at 3:39 AM, M.-A. Lemburg <mal at egenix.com> wrote:

> Ian Bicking wrote:
> > On Thu, Jun 24, 2010 at 5:16 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> >
> >> Almir Karic wrote:
> >>> i would like to help out with the move.
> >>>
> >>> is anyone actually opposed to moving to GAE (either moving the current
> >>> code base or re-write, whichever seems more appropriate)?
> >>
> >> I don't think people are opposed to having a PyPI clone on GAE,
> >> but moving the existing installation to GAE is something we would
> >> have to discuss separately.
> >>
> >> I for one would not welcome such a change, since we then completely
> >> lose control over service availability.
> >>
> >
> > I don't really understand what this means.  Services become unavailable
> > sometimes.  A computer breaks, a company shuts down, an agreement ends.  We
> > don't necessarily have "control" over these situations, but we can respond
> > to them.  If App Engine goes down and the App Engine team is all like
> > "whatever, we'll get around to fixing stuff sometime" then sure it's a
> > problem.  But it's not a plausible problem.  The plausible problem is that
> > App Engine goes down, as it has from time to time, and we have to wait for
> > them to figure out what's wrong and fix it.  *We* don't have to fix it, we
> > only have to *wait for someone else to do it*.  I don't see any reason why
> > *we* are any better at fixing issues than the App Engine team would be.
> > Also presumably when there is a failure we want for the failure to be
> > understood and avoided in the future.  The App Engine team does that.  And
> > they do that *for us*.
>
> I hear you, but don't agree that putting the runtime into the
> hands of the GAE would get us an overall better service :-)
>
> The point is that with GAE you only have control over the code
> that you post there. Everything else is under control of the GAE
> team (and their automatic administration systems), i.e. whether
> your data is available and whether there are
> proper backups, whether the site is reachable or not, whether
> the performance is available and meets your requirements, whether
> the service is accessible, fast enough and has low latency, etc.
>
> So if something breaks, you can only fix it, if the problem
> is caused by a bug in the code. For all other situations, you
> have to wait for the GAE team to go in and do whatever is needed.
>
> I'm not saying that the GAE team would be doing a poor job,
> but just sitting there waiting for them to fix it in any
> of the typical problem situations (apart from a bug in the
> code), is asking a bit much, IMHO.
>

If GAE was just another hosting system, then sure -- but it's not.  For
instance, Noah mentioned if Apache went down (or the equivalent) there's
someone with a pager who will respond to it.  Except GAE isn't actually like
that; application instances can be automatically killed, and machines are
monitored automatically and brought out of the pool as necessary.  We're not
replacing our diligence with Google employees, it would be replaced with
machines.

Of course there might be network problems or Google's own problems growing
the service.  But a substantial class of problems (problems that I believe
have actually caused downtime) are simply eliminated from the system.  GAE
has fewer serviceable parts; that looks like losing control, but it's really
the normal progression away from manual interactions.  I would really like
if there was an open source alternative that provided that kind of
infrastructure, but there isn't.

Another advantage to GAE is that if there are application errors, it would
be much easier for anyone to work on them -- anyone can sign up and receive
a free GAE account and deploy the code with almost no effort, and they will
have hosting that is completely equivalent to anyone else's hosting.  The only
difference would be the data set, and it is possible (maybe even likely)
that some class of problems will only be noticeable with a full dataset.
That's true now as well, like for some UI problems where pages have become
unwieldy, and I think it would be really helpful (regardless of GAE) if PyPI
had a cleaned-up-export built into it.

Other cloud service providers provide something very different from GAE, and
I don't think they would give a lot of benefit.  The one advantage I see is
that we (well, anyone) could spin up a new instance in a consistent state.
Everything else is basically the same, including all the same management
issues -- there's no one to kick Apache except us, for instance.  Honestly
if I have any skin in the game it's actually for a system like this, as I've
been working on this sort of infrastructure (http://cloudsilverlining.org)
-- I only propose GAE because I genuinely think it will work best for a
volunteer-run piece of infrastructure like PyPI.

> We have to find a middle ground, where we can still apply the
> necessary hand holding ourselves, if we like to, while leaving
> most of the day-to-day tasks to automatic tools or other service
> providers to deal with.
>
> Since PyPI is becoming a central piece of Python community
> infrastructure, we need to make sure that we can provide a very
> good uptime of the service and fast access to the data,
> esp. for the automatic download tools.
>
> Fortunately, those tools only use static data, so focusing on
> making that highly available will get us a much better service
> uptime with little extra effort.
>
> > In some catastrophic case we could move the site to another server, use
> > TyphoonAE to move the code over (or simply require that there is a
> > sufficient abstraction layer to allow for a more normal environment) and
> > bring the site up.  We control the domain, we can ultimately control where
> > it is hosted.  This kind of failure seems like it would be far more likely
> > given our current situation than on App Engine, but moving to App Engine
> > would not somehow make this kind of move impossible.
>
> True, but do you really want to go through all that trouble
> just because GAE is down or too slow to be usable again ?
>

That's the catastrophic case, where Google decides they don't care about App
Engine or something like that.  Right now we'd have to do the same thing if
the server's hard disk dies, which is obviously far more likely.

> If we were to go for a cloud service to deploy the PyPI runtime, I'd
> much rather like to see a standard virtualized server approach
> being used.
>
> With that approach, moving (virtual) servers would take
> at most 5 minutes, if needed at all - you can rather easily set up
> virtual servers as a high-availability cluster and then have
> them manage the failover all by themselves.
>

Setting up infrastructure for fail-overs is hard, and it would be easy for
us to set it up for the wrong pieces (the ones that aren't breaking).  In
some sense this is why I'm not excited about mirroring, because I don't
think it's fail-over for the pieces likely to break.

I do like the static file proposal, also.  I think just putting more content
into static files could potentially fix most of our problems, along with
maybe a bit of server tweaking (to make sure even if PyPI goes down, it
doesn't take Apache and the static files with it).  I think using a CDN
would be a nice step for speed, but is less important for reliability; I
think generating things with a cron job will reduce reliability because it's
exactly the kind of behind-the-scenes machinery that could break without
someone noticing, and we don't have a dedicated staff paying attention to
things like that.  If a new package registration breaks, I'd far rather it
be rejected immediately (e.g., from setup.py register) than for a broken
cron job to keep it from getting in the simple index.
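
To make the cron-versus-immediate point concrete, here is a minimal sketch of regenerating a package's static index page as part of the registration request itself, so that a failure surfaces to the uploader right away. This is an illustration only and not PyPI's actual code: the function names, the directory layout and the page markup are assumptions made for the example.

```python
import html
import os
import tempfile


def write_simple_index(root, package, filenames):
    """Write <root>/<package>/index.html atomically; raise on any failure."""
    pkg_dir = os.path.join(root, package)
    os.makedirs(pkg_dir, exist_ok=True)
    links = "\n".join(
        '<a href="%s">%s</a><br/>' % (html.escape(f, quote=True), html.escape(f))
        for f in sorted(filenames)
    )
    page = "<html><body>\n%s\n</body></html>\n" % links
    # Write to a temporary file in the same directory, then rename it into
    # place, so the web server never sees a half-written page.
    fd, tmp = tempfile.mkstemp(dir=pkg_dir)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(page)
        os.chmod(tmp, 0o644)  # mkstemp creates the file readable by owner only
        os.replace(tmp, os.path.join(pkg_dir, "index.html"))
    except Exception:
        os.unlink(tmp)
        raise
    return os.path.join(pkg_dir, "index.html")


def register(root, package, filenames):
    """Hypothetical registration handler: success is only reported once the
    static page exists; any exception propagates back to the client."""
    path = write_simple_index(root, package, filenames)
    return "registered %s (index at %s)" % (package, path)
```

The design choice is simply that the static page is written before the request is acknowledged, so there is no background job that can silently stop running.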

-- 
Ian Bicking  |  http://blog.ianbicking.org

From tjreedy at udel.edu  Fri Jun 25 18:54:41 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 25 Jun 2010 12:54:41 -0400
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <D0684B0E-2117-45BB-A498-E61E385B52BE@coderanger.net>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>	<loom.20100625T091650-630@post.gmane.org>	<CEA37824-3DA6-4828-908E-39C85F8F6099@coderanger.net>	<loom.20100625T094453-63@post.gmane.org>
	<D0684B0E-2117-45BB-A498-E61E385B52BE@coderanger.net>
Message-ID: <i02n0i$1ao$1@dough.gmane.org>

There are obviously objections, valid or not, to moving PyPI lock, 
stock, and barrel to GAE, and perhaps to any one proprietary setting, 
without any direct experience with GAE.

There should be no objection to a GAE mirror provided by one or more 
people with GAE knowledge and enthusiasm. It could be run up to the free 
limits. If it hit them, the operators could try to get a special case 
upgrade. "We should be able to get ..." does not cut it.

Operators of a GAE mirror could make it a mirror-plus if they wanted. Perhaps 
try an alternate (competitive) search page that would make it easier to 
find packages that run on a specific Python version or versions.

After a year of GAE experience, making it a prime locus might -- or 
might not -- be more sensible.

-- 
Terry Jan Reedy


From martin at v.loewis.de  Fri Jun 25 22:19:37 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 Jun 2010 22:19:37 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <D0684B0E-2117-45BB-A498-E61E385B52BE@coderanger.net>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>	<loom.20100625T091650-630@post.gmane.org>	<CEA37824-3DA6-4828-908E-39C85F8F6099@coderanger.net>	<loom.20100625T094453-63@post.gmane.org>
	<D0684B0E-2117-45BB-A498-E61E385B52BE@coderanger.net>
Message-ID: <4C250F59.9050109@v.loewis.de>

> It's mostly a question of ongoing management.
> Apache+Linux+$SQLSERVER+etc can certainly handle our needs (which,
> let's face it, aren't really that complex), but we don't have a
> full-time management staff for our server. By leaning on Google (or
> Amazon, Rackspace, etc) we don't have to worry about the day-to-day
> details of running the site. How many of the recent PyPI downtimes
> have just required bouncing Apache? Wouldn't it have been nice if a
> site engineer got paged within 60 seconds and had it dealt with soon
> after instead of having to wait for one of the PyPI volunteers to
> notice and get to a computer? It isn't a question of capability, it
> is just where are our man-hours best spent: simple maintenance or
> actually improving the site?

Still, there is significant, fundamental opposition to binding PyPI to
any vendor tightly in terms of implementation. This applies to GAE,
and (probably less strongly) to S3. I believe that Antoine just voices
a wide-spread concern (rather than him representing a singular opinion).

Therefore, I will personally refrain from endorsing any port of PyPI to
GAE. If people think it would be worthwhile, they could still start a
port; if they wanted that port to become pypi.python.org eventually,
they'd have to convince Richard Jones, me, or the PSF board. I know that
without an advanced prototype, I won't be convinced.

Regards,
Martin

From martin at v.loewis.de  Fri Jun 25 22:24:08 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Fri, 25 Jun 2010 22:24:08 +0200
Subject: [Catalog-sig] Rewrite PyPI for App Engine?
In-Reply-To: <AANLkTimXA3S891EZy_MeKtoasmRzy3OqJ7EwbSRfAfvt@mail.gmail.com>
References: <AANLkTinmWpcE7Ko66gaEI_iPm3rqwKVA7LKFAV5sWkPJ@mail.gmail.com>
	<4C23D92F.5060105@egenix.com>	<AANLkTikMBLMcFRgVYc-sIcvSQapR8H96xrCf-Xma1Aq3@mail.gmail.com>
	<4C246B51.9010700@egenix.com>
	<AANLkTimXA3S891EZy_MeKtoasmRzy3OqJ7EwbSRfAfvt@mail.gmail.com>
Message-ID: <4C251068.5030804@v.loewis.de>

Am 25.06.2010 18:49, schrieb Ian Bicking:
> That's true now as well, like for some
> UI problems where pages have become unwieldy, and I think it would be
> really helpful (regardless of GAE) if PyPI had a cleaned-up-export built
> into it.

Not sure what you mean by that. pg_dump seems to work fine.

Regards,
Martin

From mal at egenix.com  Tue Jun 29 16:39:54 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 29 Jun 2010 16:39:54 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability (version 2)
Message-ID: <4C2A05BA.5050808@egenix.com>

After the discussions we've had on the catalog sig, I have updated
the proposal to include comments and clarifications regarding the setup
and its relationship to the mirror PEP (see the end of the proposal).

While I don't think that the proposal has an influence on whether
or when PEP 381 gets rolled out or not, I will delay the PSF board
vote on the proposal until the August board meeting. Perhaps that
will even encourage developers to put more time on PEP 381.

Regarding the costs of the cloud idea, I think this would actually
be a good way of getting more donations for the PSF - possibly
even with a net win. It's one of the few visible and tangible
things the PSF has to offer to the community.

Overall, I think this is a net win for everybody: users, developers
and the PSF.

"""
PSF-Proposal: 100
Title: Move PyPI static data to the cloud for better availability
Version: 2
Last-Modified: 2010-06-29
Author: mal at lemburg.com (Marc-André Lemburg)
Discussions-To: catalog-sig at python.org
Status: Draft
Type: Informational
Created: 2010-06-14
Post-History:


Proposal: Move PyPI static data to the cloud for better availability
========================================================================

Motivation
----------

PyPI has in recent months seen several outages with the index being
unavailable both to users of the web GUI interface and to package
administration tools such as easy_install from setuptools.

As more and more Python applications rely on tools such as
easy_install for direct installation, or zc.buildout to manage the
complete software configuration cycle, the PyPI infrastructure
receives more and more attention from the Python community.

While we don't have hard numbers available (there doesn't appear to be
any monitoring in place), the number of discussions about PyPI
outages in the mailing lists has increased to a point where we cannot
simply ignore those complaints anymore.

In order to maintain its credibility as a software repository, to
support the many different projects relying on the PyPI infrastructure
and the many users who rely on the simplified installation process
enabled by PyPI, the PSF needs to take action and move the essential
parts of PyPI to a more robust infrastructure that provides:

 * scalability
 * 24/7 outsourced system administration management
 * redundant storage
 * geo-localized fast and reliable access


Current Situation
-----------------

PyPI is currently run from a single server hosted in The Netherlands
(ximinez.python.org).  This server is run by a very small team of
sysadmins.

PyPI itself has in recent months been mostly maintained by one
developer: Martin von Loewis.

Projects are underway to enhance PyPI in various ways, including a
proposal to add external mirroring (PEP 381), but these are still a
long way from being finalized and implemented in the existing client
tools.

According to Martin, the server side features of PEP 381, including a
few undocumented extensions to provide package signatures, are already
implemented.

However, without client tools to make use of them, this is not going
to change the current situation for existing PyPI users.

Furthermore, those client tool enhancements would first have to get
adopted by PyPI users by either replacing their client tools with
updated versions or switching to new client tools, which is likely
going to take months to years. Existing client tool users won't see an
immediate improvement.


Usage
-----

PyPI provides four different mechanisms for accessing the stored
information:

 * a web GUI that is meant for use by humans
 * an RPC interface which is mostly used for uploading new
   content
 * a semi-static /simple package listing, used by setuptools
 * a static area /packages for package download files and
   documentation, used by both the web GUI and setuptools

The /simple package listing is a dump of all packages in PyPI using a
simple HTML page with links to sub-pages for each package. These
sub-pages provide links to download files and external references.

External tools like easy_install only use the /simple package
listing together with the hosted package download files.

While the /simple package listing is currently dynamically created
from the database in real-time, this is not really needed for normal
operation. A static copy created every 10-20 minutes would provide the
same level of service in much the same way.
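
As an illustration of how small such a static copy is, here is a
minimal sketch in Python. The function names are hypothetical and the
HTML layout is only an approximation of what /simple serves; a real
script would take the package and file lists from the PyPI database.

```python
from html import escape
from urllib.parse import quote


def render_simple_index(package_names):
    """Return one HTML page with a link to the sub-page of each package."""
    links = "\n".join(
        '<a href="{0}/">{1}</a><br/>'.format(quote(name), escape(name))
        for name in sorted(package_names, key=str.lower)
    )
    return (
        "<html><head><title>Simple Index</title></head>\n"
        "<body>\n{0}\n</body></html>\n".format(links)
    )


def render_package_page(name, files):
    """Return the sub-page for one package.

    `files` is an iterable of (filename, url) pairs.
    """
    links = "\n".join(
        '<a href="{0}">{1}</a><br/>'.format(escape(url), escape(filename))
        for filename, url in files
    )
    return (
        "<html><head><title>Links for {0}</title></head>\n"
        "<body><h1>Links for {0}</h1>\n{1}\n</body></html>\n".format(
            escape(name), links)
    )
```

The rendered strings can then be written to disk by the cronjob and
served or uploaded like any other static file.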


Moving static data to a CDN
---------------------------

Under the proposal the static information stored in PyPI
(meta-information as well as package download files and documentation)
is moved to a content delivery network (CDN).

For this purpose, the /simple package listing is replaced with a
static copy that is recreated every 10-20 minutes using a cronjob on
the PyPI server.

At the same intervals, another script will scan the package and
documentation files under /packages for updates and upload any changes
to the CDN for neartime availability.

By using a CDN the PSF will enable and provide:

 * high availability of the static PyPI content
 * offload management to the CDN
 * enable geo-localized downloads, i.e. the files are hosted
   on a nearby server
 * faster downloads
 * more reliability and scalability
 * move away from a single point of failure setup

Note that the proposal does not cover distribution of the dynamic
parts of PyPI. As a result uploads to PyPI may still fail if the PyPI
server goes down. However, these dynamic parts are currently not being
used by the existing package installation tools.


Choice of CDN: Amazon Cloudfront
--------------------------------

To keep the costs low for the PSF, Amazon Cloudfront appears to be
the best choice for a CDN.

Cloudfront is supported by a set of Python libraries (e.g. Amazon S3
lib and boto), upload scripts are readily available and can easily be
customized.

 http://www.saltycrane.com/blog/2008/12/card-store-project-4-notes-using-amazons-cloudfront/

Other CDNs, such as Akamai, are either more expensive or require
custom integration.  Python-based tools are not always available; in
fact, even finding such information is difficult for most of the
proprietary CDNs.


Cloudfront: quality of service
------------------------------

Amazon Cloudfront uses S3 as the basis for the service. S3 has been
around for years and has a very stable uptime:

 http://www.readwriteweb.com/archives/amazon_s3_exceeds_9999_percent_uptime.php

Cloudfront itself has been around since Nov 2008. Amazon still uses
the web 2.0 "beta" marketing term on it.

You can check their current online status using this panel:

 http://status.aws.amazon.com/

Apart from the gained availability and outsourced management, we'd
also get faster downloads in most parts of the world, due to the local
caching Cloudfront is applying. This caching can be used to further
increase the availability, since we can control the expiry time of
those local copies.

So in summary, we are replacing a single point of failure with an N
server fail-over system (with N being the number of edge caching
servers they use).


How Cloudfront works
--------------------

Cloudfront uses Amazon's S3 storage system which is based on
"buckets".  These can store any number of files in a directory-like
structure. The only limit is a 5GB per file limit - more than enough
for any PyPI package file.

Cloudfront provides a domain for each registered S3 bucket via a
"distribution" which is then made available through local cache
servers in various locations around the world. The management of which
server to use for an incoming request is transparently handled by
Amazon. Once uploaded to the S3 bucket, the files will be distributed
to the cache servers on demand and as necessary.

Each edge server maintains a cache of requested files and
refetches the files after an expiry time which can be defined when
uploading the file to the bucket.

To simplify things on our side, we'll set up a CNAME DNS alias
for the Cloudfront domain issued by Amazon to our bucket:

 pypi-static.python.org. IN CNAME d32z1yuk7jeryy.cloudfront.net.

In the unlikely event of a longer downtime of the whole Amazon
Cloudfront system, our system administrators could then easily change
the DNS alias pypi-static.python.org to point back to the PyPI server
until the Cloudfront problem is rectified.

For more details, please see the Cloudfront documentation and FAQ:

 http://aws.amazon.com/documentation/cloudfront/
 http://aws.amazon.com/cloudfront/faqs/


Integration
-----------

In order to keep the number of changes to existing client side tools
and PyPI itself to a minimum, the installation will try to be as
transparent to both the server and the client side as possible.

This requires on the server side:

 * few, if any changes to the PyPI code base
 * simple scripts, driven by cronjobs
 * a simple distributed redirection setup to avoid having
   to change client side tools

On the client side:

 * no need to change the existing URL http://pypi.python.org/simple
   to access PyPI
 * redirects are already supported by setuptools via urllib2

Note that we are avoiding creating a lock-in situation by moving the
data to a CDN, since the needed configuration changes on the server
side can easily be rolled back to the current setup, without affecting
the client side.


Server side: upload cronjobs
----------------------------

Since the /simple index tree is currently being created dynamically,
we'd need to create static copies of it at regular intervals in order
to upload the content to the S3 bucket. This can easily be done using
tools such as wget or curl or using a custom Python script that hooks
directly into the PyPI database (and reuses the code for generating
the /simple tree).

Both the static copy of the /simple tree and the static files uploaded
to /packages then need to be uploaded or updated in the S3 bucket by a
cronjob running every 10-20 minutes.
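
A rough sketch of the change detection such a cronjob needs, assuming
the script keeps a manifest of content digests from its previous run.
All names here are hypothetical, and the actual upload step (for
example via boto) is deliberately left out.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_tree(root):
    """Map each file path (relative to root, '/'-separated) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            key = os.path.relpath(full, root).replace(os.sep, "/")
            manifest[key] = file_digest(full)
    return manifest


def plan_sync(previous, current):
    """Compare two manifests and return (to_upload, to_delete) key lists."""
    to_upload = sorted(k for k, v in current.items() if previous.get(k) != v)
    to_delete = sorted(k for k in previous if k not in current)
    return to_upload, to_delete
```

Only the keys returned by plan_sync() would have to be sent to the
bucket, which keeps each cron run cheap when nothing has changed.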

In a second phase of the project, we could extend PyPI to
automatically push updates to Cloudfront whenever a new file is
uploaded or the package data changes.


Server side: downloads statistics
---------------------------------

The next step would then be to configure access logs:

 http://docs.amazonwebservices.com/AmazonCloudFront/latest/DeveloperGuide/index.html?AccessLogs.html

and add a cronjob to download them to the PyPI server.

Since the format is a bit different than the Apache log format used by
the PyPI software, we'd have two options:

 1. convert the Cloudfront format to Apache format and simply
    append the converted logs to the local log files

 2. write a Cloudfront log file reader and add it to the
    apache_count_dist.py script that updates the download
    counts on the web GUI

Both options require no more than a few hours to implement and test.
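
Option 1 could look roughly like the following sketch. It makes
several assumptions that would need checking against the Cloudfront
documentation and the PyPI scripts: that the log files are
tab-separated with a "#Fields:" header naming the columns, that the
timestamps are in UTC, that the user agent is URL-encoded, and that
the local logs use the Apache "combined" format. The request protocol
is not taken from the log, so a placeholder is written instead.

```python
from urllib.parse import unquote

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def to_apache_combined(fields, line, protocol="HTTP/1.1"):
    """Convert one tab-separated log line to Apache combined format.

    `fields` is the list of column names from the '#Fields:' header.
    `protocol` is a placeholder (assumption), not read from the log.
    """
    row = dict(zip(fields, line.rstrip("\n").split("\t")))
    year, month, day = row["date"].split("-")
    timestamp = "{0}/{1}/{2}:{3} +0000".format(
        day, MONTHS[int(month) - 1], year, row["time"])
    uri = row["cs-uri-stem"]
    query = row.get("cs-uri-query", "-")
    if query != "-":
        uri += "?" + query
    template = ('{ip} - - [{ts}] "{method} {uri} {proto}" '
                '{status} {size} "{ref}" "{ua}"')
    return template.format(
        ip=row["c-ip"], ts=timestamp, method=row["cs-method"], uri=uri,
        proto=protocol, status=row["sc-status"], size=row["sc-bytes"],
        ref=row.get("cs(Referer)", "-"),
        ua=unquote(row.get("cs(User-Agent)", "-")),
    )


def convert_log(lines):
    """Yield converted lines, taking column names from the log header."""
    fields = None
    for line in lines:
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line.startswith("#") or not line.strip():
            continue
        elif fields is not None:
            yield to_apache_combined(fields, line)
```

Reading the column names from the header rather than hard-coding them
means the script keeps working if Amazon appends further columns.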


Server side: redirection setup
------------------------------

Since PyPI wasn't designed to be put on a CDN, it mixes static file
URL paths with dynamic access ones, e.g.

dynamic:

 http://pypi.python.org/pypi
 (and a few others)

static:

 http://pypi.python.org/simple
 http://pypi.python.org/packages

To move part of the URL path tree to a CDN, which works based on
domains, we will need to provide a URL redirection setup that
redirects client side tools to the new location.

As Martin von Loewis mentioned, this will require distributing the
redirection setup to more than just one server as well.

Fortunately, this is not difficult to do: it requires a preconfigured
lighttpd (*) setup running on N different servers which then all
provide the necessary redirections (and nothing more):

dynamic:

 http://pypi.python.org/ -> http://ximinez.python.org/pypi
 http://pypi.python.org/pypi -> http://ximinez.python.org/pypi
 (and possibly a few others)

static:

 http://pypi.python.org/simple -> http://pypi-static.python.org/simple
 http://pypi.python.org/packages -> http://pypi-static.python.org/packages
 (note: pypi-static.python.org is a CNAME alias for the Cloudfront
  domain issued to the S3 bucket where we upload the data)
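
The mapping above is simple enough to express as a single lookup
function, shown here in Python purely as a sketch of the rules. The
real setup would live in the lighttpd configuration. The function
name is hypothetical, and sending every other path to the dynamic
server is an assumption covering the "possibly a few others" case.

```python
STATIC_HOST = "http://pypi-static.python.org"
DYNAMIC_HOST = "http://ximinez.python.org"
STATIC_PREFIXES = ("/simple", "/packages")


def redirect_target(path):
    """Return the absolute URL a redirection server would redirect to."""
    for prefix in STATIC_PREFIXES:
        # Match the prefix itself or anything below it, but not e.g. /simplex
        if path == prefix or path.startswith(prefix + "/"):
            return STATIC_HOST + path
    if path in ("", "/"):
        return DYNAMIC_HOST + "/pypi"
    # Assumption: all remaining paths belong to the dynamic PyPI server
    return DYNAMIC_HOST + path
```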

The pypi.python.org domain would then have to be set up to map to
multiple IP addresses via DNS round-robin, one entry for each
redirection server, e.g.

 pypi.python.org. IN A 123.123.123.1
 pypi.python.org. IN A 123.123.123.2
 pypi.python.org. IN A 123.123.123.3
 pypi.python.org. IN A 123.123.123.4

Redirection servers could be run on all PSF server machines, and, to
increase availability, on PSF partner servers as well.

It should be noted that current client side PyPI tools do not support
automatic retry, so there still is a chance that the redirection
server they pick on first try will fail. The user would then just have
to retry the download to get a new server address. Automatic retry
would, of course, create a better user experience, but this requires
a few small changes in the existing PyPI client tools.
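
To give an idea of how small the client side change could be, here is
a sketch of a retry loop over all addresses of a round-robin DNS
entry. The function names are hypothetical and this is not how
setuptools currently works. The caller passes in the function that
performs the actual download.

```python
import socket


def resolve_all(host, port=80):
    """Return the distinct IP addresses DNS reports for host."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


def fetch_with_retry(addresses, fetch):
    """Call fetch(address) for each address until one succeeds.

    Network level failures (OSError and its subclasses) move on to the
    next address; the last error is re-raised if every address fails.
    """
    last_error = None
    for address in addresses:
        try:
            return fetch(address)
        except OSError as exc:
            last_error = exc
    if last_error is None:
        raise ValueError("no addresses given")
    raise last_error
```

A client would call fetch_with_retry(resolve_all("pypi.python.org"),
download), where download is whatever function opens the URL against
one specific address.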


(*) lighttpd is a lightweight and fast HTTP server. It's easy to
set up, doesn't require a lot of resources on the server machine and
runs stably.


Long-term changes
-----------------

While enabling the above redirection setup, we should also start
working on changing PyPI and the client tools to use two new domains
which then cleanly separate the static CDN file access from the
dynamic PyPI server access:

 pypi.python.org
 pypi-static.python.org

Such a transition on the client side is expected to take at least a
few years. After that, the redirection service can be shut down or
used to distribute and scale the dynamic PyPI service parts.


Future improvements
-------------------

We could replace the cronjob system with a trigger based system
that uploads changes as soon as the PyPI server receives them.


Side-effects
------------

Restarts of the PyPI server, network outages, or hardware failures
would not affect the static copies of the PyPI on the CDN. setuptools,
easy_install, pip, zc.buildout, etc. would continue to work.

The S3 bucket would serve as additional backup for the files on PyPI.

Later integration with Amazon EC2 (their virtual server offering)
would easily be possible for more scalability and reduced system
administration load.

We don't have to worry about issues such as mirror servers having
out-of-date data. Manipulation of packages, e.g. to introduce trojans,
is also minimized, since the Cloudfront edge servers get their data
straight from the S3 bucket.


Costs
-----

Amazon charges for S3 and Cloudfront storage, transfer and access. The
costs vary depending on location.

 http://aws.amazon.com/cloudfront/#pricing
 http://aws.amazon.com/s3/#pricing

To get an idea of the costs, we'd have to take a closer look at
the PyPI web stats:

 http://pypi.python.org/webstats/usage_201005.html

In May 2010, PyPI transferred 819GB of data and had to handle 22
million requests.

Using the AWS monthly calculator this gives roughly (I used 37KB as
average object size and 35% US, 35% EU, 10% HK, 10% JP as basis): USD
132 per month, or about USD 1,584 per year for Cloudfront.

For the S3 storage, the costs amount to roughly USD 30 per month, or
USD 360 per year (100GB storage, 50GB traffic in, 100GB traffic
out, 1000 PUT requests, 1 million GET requests).

Total costs are an estimated USD 1,944 per year.
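
The arithmetic behind these figures can be checked with a few lines
of Python. The monthly prices are the calculator results quoted
above, not independently verified. The average object size is
derived from the May 2010 web stats, assuming decimal units
(1GB = 1,000,000KB).

```python
CLOUDFRONT_USD_PER_MONTH = 132
S3_USD_PER_MONTH = 30
TRANSFER_GB_MAY_2010 = 819
REQUESTS_MAY_2010 = 22 * 10**6

cloudfront_per_year = CLOUDFRONT_USD_PER_MONTH * 12
s3_per_year = S3_USD_PER_MONTH * 12
total_per_year = cloudfront_per_year + s3_per_year

# Average size of a served object in KB (decimal units assumed)
average_object_kb = TRANSFER_GB_MAY_2010 * 10**6 / REQUESTS_MAY_2010

print("Cloudfront per year: USD", cloudfront_per_year)
print("S3 per year:         USD", s3_per_year)
print("Total per year:      USD", total_per_year)
print("Average object size: %.1f KB" % average_object_kb)
```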


Refinancing the costs
---------------------

Since PyPI is being used as an essential resource by many important
Python projects (Zope, Plone, Django, etc.), it's fair to ask the
respective foundations and the general Python community for donations
to help refinance the administration costs.

A prominent donation button should go on the PyPI page with a text
explaining how PyPI is being hosted and why donations are necessary.

We may also be able to directly ask for donations from the above
foundations. Details of this are currently being evaluated by the PSF
board (there are some issues related to our non-profit status that
make this more complicated than it appears at first).

Unlike other less visible PSF activities, providing and running PyPI
is a real tangible service to the community, creating more incentive
for Python users, including companies relying on the PyPI service, to
donate to the PSF.

Overall, we should be able to refinance the costs of this improved
service level, perhaps even generate more donations than needed to
fund other PSF activities.


Effort
------

Given that most of the tools are readily available, setting up the
servers shouldn't take more than 2-3 developer days for developers
who've worked with Amazon S3 and Cloudfront before, including testing.

It is expected that we'll find volunteers to implement the necessary
changes.


Competing with PEP 381
----------------------

A few PEP 381 developers have stated that this proposal would limit
the interest in PEP 381 implementations and argue that the proposal
would compete with their proposed strategy.

Just to clarify, this proposal does not try to compete with the mirror
proposal outlined in PEP 381. Instead it focuses on a readily
available solution that can be implemented in a few days and only
requires little additional system administration.

In order to further underline this, the proposal will be presented to
the board for approval in their August board meeting (currently
scheduled for August 16), giving the PEP 381 developers more time to
work and improve their PEP 381 client implementations.

If the PEP 381 infrastructure gets rolled out, both the external
mirrors and the cloud mirrors can work side-by-side, so there is no
conflict.

"""

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 15 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                33 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/
_______________________________________________
Catalog-SIG mailing list
Catalog-SIG at python.org
http://mail.python.org/mailman/listinfo/catalog-sig

From ziade.tarek at gmail.com  Tue Jun 29 17:05:53 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 29 Jun 2010 17:05:53 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability (version 2)
In-Reply-To: <4C2A05BA.5050808@egenix.com>
References: <4C2A05BA.5050808@egenix.com>
Message-ID: <AANLkTik0ElfHG3t0R3IalvozNzRRRYnwDkec5tZ4yLNi@mail.gmail.com>

On Tue, Jun 29, 2010 at 4:39 PM, M.-A. Lemburg <mal at egenix.com> wrote:
[..]
> Competing with PEP 381
> ----------------------
>
> A few PEP 381 developers have stated that this proposal would limit
> the interest in PEP 381 implementations and argue that the proposal
> would compete with their proposed strategy.

You can replace a "few" with the "PEP 381 authors" here.

>
> Just to clarify, this proposal does not try to compete with the mirror
> proposal outlined in PEP 381. Instead it focuses on a readily
> available solution that can be implemented in a few days and only
> requires little additional system administration.

I still disagree with this statement, it fully competes with PEP 381.

Your proposal and PEP 381 are both trying to solve the same issue.

In fact, I could copy-paste your "Motivation" section and put it in PEP 381 :)


> In order to further underline this, the proposal will be presented to
> the board for approval in their August board meeting (currently
> scheduled for August 16), giving the PEP 381 developers more time to
> work and improve their PEP 381 client implementations.

As I said earlier, the mirroring work was not finished because of a
lack of resources and time. Giving us a deadline before you make your
proposal is not really helping.

That's like saying: "if you can finish your PEP 381 thing before
August, great. If not we will implement the other proposal, but with
the help and resources provided by the PSF."

So you are just underlining that your solution is faster to implement here.

If you really want to compare both solutions, this section should
compare pros and cons instead, and think of the best long term
solution for the community.

I still think that setting up a cloud doesn't solve anything, you will
still have to have a sysadmin behind a computer if something goes
wrong. And this will happen in the cloud as well, as I don't think
it's the silver bullet.

Furthermore, the outages were not as bad as you describe in your PEP.
You should give the real numbers in your document, and calculate the
availability percentage, instead of saying "several outages". So far,
PyPI is more reliable than Twitter I think :)  (a few days / years)
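
For reference, the availability percentage is a one-line calculation
once the downtime is known. The sketch below uses a hypothetical
function name and made-up downtime figures, since no measured numbers
have been posted in this thread.

```python
def availability_percent(downtime_hours, period_hours=365 * 24):
    """Return availability over a period as a percentage."""
    return 100.0 * (1.0 - downtime_hours / period_hours)


# Hypothetical inputs: 1, 2 and 3 full days of downtime in one year
for days in (1, 2, 3):
    print("%d day(s) down: %.2f%% available"
          % (days, availability_percent(days * 24)))
```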

Regards,
Tarek
-- 
Tarek Ziadé | http://ziade.org

From ianb at colorstudy.com  Tue Jun 29 18:54:03 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 29 Jun 2010 11:54:03 -0500
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability (version 2)
In-Reply-To: <4C2A05BA.5050808@egenix.com>
References: <4C2A05BA.5050808@egenix.com>
Message-ID: <AANLkTil9ez8UfQ9VjPInrTYw6p8NYX8ko4h1Y8QbjO4X@mail.gmail.com>

A few notes:

On Tue, Jun 29, 2010 at 9:39 AM, M.-A. Lemburg <mal at egenix.com> wrote:

> In order to maintain its credibility as a software repository, to
> support the many different projects relying on the PyPI infrastructure
> and the many users who rely on the simplified installation process
> enabled by PyPI, the PSF needs to take action and move the essential
> parts of PyPI to a more robust infrastructure that provides:
>
>  * scalability
>  * 24/7 outsourced system administration management
>

In a sense a CDN offers outsourced system administration -- if you upload
content, they are responsible for making sure it gets served up.  But that's
all they do.

Other "cloud" systems only provide system administration for infrastructure
issues, like a network routing issue.  They do not provide anything on your
machine itself.  It is possible to get hosting with system administration
included, Rackspace Managed Servers are an example, but these are quite
expensive -- basically you are paying an overhead on hosting to have a
competent sysadmin on hand.

> Usage
> -----
>
> PyPI provides four different mechanisms for accessing the stored
> information:
>
>  * a web GUI that is meant for use by humans
>  * an RPC interface which is mostly used for uploading new
>   content
>  * a semi-static /simple package listing, used by setuptools
>  * a static area /packages for package download files and
>   documentation, used by both the web GUI and setuptools
>

The static packages are used by the RPC (setup.py upload) and automatically
linked in.  There is no privileged aspect to them, Setuptools
(easy_install/pip) just reads the links provided, and if they happen to
point to pypi packages then that's what is fetched.  I mention this because
changing those URLs on the server side will be easy as a result.


> The /simple package listing is a dump of all packages in PyPI using a
> simple HTML page with links to sub-pages for each package. These
> sub-pages provide links to download files and external references.
>
> External tools like easy_install only use the /simple package
> listing together with the hosted package download files.
>
> While the /simple package listing is currently dynamically created
> from the database in real-time, this is not really needed for normal
> operation. A static copy created every 10-20 minutes would provide the
> same level of service in much the same way.
>
>
> Moving static data to a CDN
> ---------------------------
>
> Under the proposal the static information stored in PyPI
> (meta-information as well as package download files and documentation)
> is moved to a content delivery network (CDN).
>
> For this purpose, the /simple package listing is replaced with a
> static copy that is recreated every 10-20 minutes using a cronjob on
> the PyPI server.
>
> At the same intervals, another script will scan the package and
> documentation files under /packages for updates and upload any changes
> to the CDN for neartime availability.
>

I disagree with this part of the proposal, because I think a 10-20 minute
delay introduces the possibility of invisible errors (an infinite delay),
and represents a real degradation of service as new versions of packages
will not be installable until after regeneration.  Also I think the RPC code
(what is invoked with setup.py register/upload) can regenerate these static
pages immediately.

Uploading to a CDN may have to be asynchronous, but to keep the data robust
we should really be storing the package locally and adding a new field to
point to the mirrored location (i.e., the CDN URL).  When the cron job runs
that field can be updated.  If the CDN upload fails (which is not unlikely)
then PyPI can keep using the local location.  The cron job would then also
be triggering another regeneration of the static file in /static, but so
long as you are only regenerating on changes this isn't much overhead.
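
A sketch of what I mean, with hypothetical names (this is not the
current PyPI schema): each file record keeps its local URL and gets an
optional CDN URL that is only filled in once the upload has been
confirmed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReleaseFile:
    filename: str
    local_url: str
    # Stays None until the cron job has confirmed the CDN upload
    cdn_url: Optional[str] = None


def download_url(record):
    """Prefer the mirrored location, fall back to the local copy."""
    return record.cdn_url or record.local_url


def mark_mirrored(record, cdn_url):
    """Called by the cron job after a successful CDN upload."""
    record.cdn_url = cdn_url
```

With this, a failed or delayed upload just means the generated pages
keep linking to the local file a little longer.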

Also, making upload/register a synchronous operation will slow down the
speed of RPC commands, but I don't think this is a problem -- I would much
rather have an upload be slow to finish than fast but not know when the
result will be available.  I don't know what kind of latency to expect,
really.


Also, I'd like to offer a counterproposal that does not use a CDN:

* Have PyPI write out static files *locally*
* Use rewrite rules so those files get served without touching PyPI.
* Move the PyPI installation to mod_wsgi (I believe it is using FCGI now?),
with conservative settings for things like MaxRequests.  I believe this will
significantly improve the problem of PyPI taking down Apache, which means
the static files will still be available even if PyPI itself is down.

This is largely work that would have to happen to move to a CDN, but it's
simpler (given how PyPI works now) and I believe will relieve most of the
problems we've seen.  PyPI right now is really quite reliable, these small
changes would I think be low-risk and less likely to introduce new problems
while addressing what I suspect is the source of problems.
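
As a sketch of the first point, writing a static page can be made
atomic so that Apache never serves a half-written file. The function
name is hypothetical, and the file mode is an assumption about how
the web server reads the files.

```python
import os
import tempfile


def write_static_page(path, html):
    """Write html to path atomically (temp file plus rename)."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # The temp file must be in the same directory for an atomic rename
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(html)
        # mkstemp creates files readable by the owner only; assume the
        # web server needs world-readable files
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Readers see either the old page or the new one, never a partial
write, which I think addresses part of the correctness concern.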


-- 
Ian Bicking  |  http://blog.ianbicking.org

From martin at v.loewis.de  Tue Jun 29 22:50:38 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Tue, 29 Jun 2010 22:50:38 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability (version 2)
In-Reply-To: <AANLkTil9ez8UfQ9VjPInrTYw6p8NYX8ko4h1Y8QbjO4X@mail.gmail.com>
References: <4C2A05BA.5050808@egenix.com>
	<AANLkTil9ez8UfQ9VjPInrTYw6p8NYX8ko4h1Y8QbjO4X@mail.gmail.com>
Message-ID: <4C2A5C9E.6000701@v.loewis.de>

> * Move the PyPI installation to mod_wsgi (I believe it is using FCGI
> now?)

For the latter: correct.

For the former (use mod_wsgi): I had actually implemented it, but needed
to revert to FCGI, because mod_wsgi would cause too many hanging servers.

> This is largely work that would have to happen to move to a CDN, but
> it's simpler (given how PyPI works now) and I believe will relieve most
> of the problems we've seen.

As for the switch to WSGI: it will *introduce* new problems.

> PyPI right now is really quite reliable,
> these small changes would I think be low-risk and less likely to
> introduce new problems while addressing what I suspect is the source of
> problems.

I disagree that these are small and low-risk. The WSGI switch will risk
stability; the others (generate static pages) will not be small, and
risk correctness.

Regards,
Martin

From ianb at colorstudy.com  Tue Jun 29 22:59:35 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 29 Jun 2010 15:59:35 -0500
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability (version 2)
In-Reply-To: <4C2A5C9E.6000701@v.loewis.de>
References: <4C2A05BA.5050808@egenix.com>
	<AANLkTil9ez8UfQ9VjPInrTYw6p8NYX8ko4h1Y8QbjO4X@mail.gmail.com> 
	<4C2A5C9E.6000701@v.loewis.de>
Message-ID: <AANLkTimu9FjvC5dEZ2BoJttHNhWXZloZfpHc-FPBVZrI@mail.gmail.com>

On Tue, Jun 29, 2010 at 3:50 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:

> > * Move the PyPI installation to mod_wsgi (I believe it is using FCGI
> > now?)
>
> For the latter: correct.
>
> For the former (use mod_wsgi): I had actually implemented it, but needed
> to revert to FCGI, because mod_wsgi would cause too many hanging servers.
>

I'm surprised; what specific mod_wsgi configuration did you try?  I've had
good luck using a daemon process and making sure no process lives too
long.  There's another configuration of mod_wsgi that runs Python in the
Apache process, which I've never used and which doesn't seem like a good
idea to me.


> > This is largely work that would have to happen to move to a CDN, but
> > it's simpler (given how PyPI works now) and I believe will relieve most
> > of the problems we've seen.
>
> As for the switch to WSGI: it will *introduce* new problems.
>
> > PyPI right now is really quite reliable,
> > these small changes would I think be low-risk and less likely to
> > introduce new problems while addressing what I suspect is the source of
> > problems.
>
> I disagree that these are small and low-risk. The WSGI switch will risk
> stability; the others (generate static pages) will not be small, and
> risk correctness.
>

I don't really know how to describe "small" or "low-risk"... maybe I should
say "smaller" and "lesser-risk" than the full CDN proposal.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From martin at v.loewis.de  Tue Jun 29 23:22:55 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Tue, 29 Jun 2010 23:22:55 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability (version 2)
In-Reply-To: <AANLkTimu9FjvC5dEZ2BoJttHNhWXZloZfpHc-FPBVZrI@mail.gmail.com>
References: <4C2A05BA.5050808@egenix.com>
	<AANLkTil9ez8UfQ9VjPInrTYw6p8NYX8ko4h1Y8QbjO4X@mail.gmail.com>
	<4C2A5C9E.6000701@v.loewis.de>
	<AANLkTimu9FjvC5dEZ2BoJttHNhWXZloZfpHc-FPBVZrI@mail.gmail.com>
Message-ID: <4C2A642F.8070605@v.loewis.de>

> I'm surprised, what specific mod_wsgi configuration did you try?

Not sure I understand the question:


WSGIDaemonProcess pypi display-name=wsgi-pypi processes=10 threads=1 maximum-requests=2000
WSGIProcessGroup pypi
WSGIPassAuthorization On
WSGIScriptAlias /pypi /data/pypi/src/pypi/pypi.wsgi
WSGIScriptAlias /simple /data/pypi/src/pypi/pypi.wsgi

According to the bzr log, I reverted that because Python would crash
(with a core dump).

>     > PyPI right now is really quite reliable,
>     > these small changes would I think be low-risk and less likely to
>     > introduce new problems while addressing what I suspect is the
>     source of
>     > problems.
> 
>     I disagree that these are small and low-risk. The WSGI switch will risk
>     stability; the others (generate static pages) will not be small, and
>     risk correctness.
> 
> 
> I don't really know how to describe "small" or "low-risk"... maybe I
> should say "smaller" and "lesser-risk" than the full CDN proposal.

Ah, ok - relatively speaking.

That is certainly true: the CDN proposal carries more risk of not
working correctly.

Regards,
Martin

From ianb at colorstudy.com  Tue Jun 29 23:39:11 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 29 Jun 2010 16:39:11 -0500
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
	better availability (version 2)
In-Reply-To: <4C2A642F.8070605@v.loewis.de>
References: <4C2A05BA.5050808@egenix.com>
	<AANLkTil9ez8UfQ9VjPInrTYw6p8NYX8ko4h1Y8QbjO4X@mail.gmail.com> 
	<4C2A5C9E.6000701@v.loewis.de>
	<AANLkTimu9FjvC5dEZ2BoJttHNhWXZloZfpHc-FPBVZrI@mail.gmail.com> 
	<4C2A642F.8070605@v.loewis.de>
Message-ID: <AANLkTikwP7fFT-a7usG1VBY3incbSA2_PkRtboLO6p3d@mail.gmail.com>

On Tue, Jun 29, 2010 at 4:22 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:

> > I'm surprised, what specific mod_wsgi configuration did you try?
>
> Not sure I understand the question:
>
>
> WSGIDaemonProcess pypi display-name=wsgi-pypi processes=10 threads=1 maximum-requests=2000
> WSGIProcessGroup pypi
> WSGIPassAuthorization On
> WSGIScriptAlias /pypi /data/pypi/src/pypi/pypi.wsgi
> WSGIScriptAlias /simple /data/pypi/src/pypi/pypi.wsgi
>
> According to the bzr log, I reverted that because Python would crash
> (with a core dump).
>

OK, that's how I would configure it too.  A core dump implies some
installation problem (e.g., mod_wsgi was compiled against one version of
Python, but is being bound to a different version -- or psycopg or some
other extension).  Graham Dumpleton is very responsive about these kinds of
issues with mod_wsgi if you mail the mod_wsgi list; I've always stuck to
debs and that's saved me from version mismatches, so I haven't actually
debugged issues like this myself.
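
For what it's worth, one rough way to look for this kind of mismatch is to
serve a tiny diagnostic WSGI script under the same daemon process group and
compare what it reports with the Python that mod_wsgi was built against.
The sketch below is only an illustration, not something that has been run
against the PyPI setup: the function name describe_runtime is made up, and
psycopg2 is just an assumed example of a C extension to probe. Note that an
extension built against the wrong Python may crash the process on import
rather than raise ImportError, which is itself a useful hint.

```python
import sys


def describe_runtime(environ):
    """Collect interpreter details that help spot a version mismatch."""
    info = {
        # Version and install prefix of the interpreter actually running.
        "version": sys.version.split()[0],
        "prefix": sys.prefix,
        # mod_wsgi normally puts its own version in the environ; fall back
        # to a placeholder when the script runs under another server.
        "mod_wsgi": str(environ.get("mod_wsgi.version", "not reported")),
    }
    # psycopg2 is an assumed example of a C extension; substitute any
    # compiled module the application depends on.
    try:
        import psycopg2
        info["psycopg2"] = getattr(psycopg2, "__version__", "unknown")
    except ImportError as exc:
        info["psycopg2"] = "import failed: %s" % exc
    return info


def application(environ, start_response):
    """Minimal WSGI app that returns the runtime details as plain text."""
    items = sorted(describe_runtime(environ).items())
    body = "\n".join("%s: %s" % pair for pair in items).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

If the reported version or prefix differs from the Python that the mod_wsgi
package was compiled for, that would be the first thing to chase.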

-- 
Ian Bicking  |  http://blog.ianbicking.org

From martin at v.loewis.de  Wed Jun 30 06:50:13 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 30 Jun 2010 06:50:13 +0200
Subject: [Catalog-sig] Proposal: Move PyPI static data to the cloud for
 better availability (version 2)
In-Reply-To: <AANLkTikwP7fFT-a7usG1VBY3incbSA2_PkRtboLO6p3d@mail.gmail.com>
References: <4C2A05BA.5050808@egenix.com>
	<AANLkTil9ez8UfQ9VjPInrTYw6p8NYX8ko4h1Y8QbjO4X@mail.gmail.com>
	<4C2A5C9E.6000701@v.loewis.de>
	<AANLkTimu9FjvC5dEZ2BoJttHNhWXZloZfpHc-FPBVZrI@mail.gmail.com>
	<4C2A642F.8070605@v.loewis.de>
	<AANLkTikwP7fFT-a7usG1VBY3incbSA2_PkRtboLO6p3d@mail.gmail.com>
Message-ID: <4C2ACD05.8070402@v.loewis.de>

> OK, that's how I would configure it too.  A core dump implies some
> installation problem (e.g., mod_wsgi was compiled against one version of
> Python, but is being bound to a different version -- or psycopg or some
> other extension).  Graham Dumpleton is very responsive about these kinds
> of issues with mod_wsgi if you mail the mod_wsgi list; I've always stuck
> to debs and that's saved me from version mismatches, so I haven't
> actually debugged issues like this myself.

Same here: I was only using Debian packages for everything, expecting
that this ought to work. I don't feel like retrying at this point,
though, when all I expect is a loss of stability. If anybody absolutely
thinks that FCGI is unacceptable and really wants to have this work with
WSGI, please let me know.

Regards,
Martin