From dk.neugierig at gmail.com  Thu Jul  8 14:29:23 2010
From: dk.neugierig at gmail.com (Fred Freeley)
Date: Thu, 8 Jul 2010 08:29:23 -0400
Subject: [Catalog-sig] Serial/Parallel port libraries
Message-ID: <AANLkTimMq7qNR9VFUP0Ol0K0oVKz6_xEPJ1oUjkDBsSd@mail.gmail.com>

I would like to use Python to control serial port based instruments
directly, and not via a USB->Serial converter.

Are there any libraries out there that would provide that functionality in
Windows Vista/7/XP?

Thanks

-- 
Commit random acts of kindness and senseless beauty.

From benji at benjiyork.com  Thu Jul  8 17:55:15 2010
From: benji at benjiyork.com (Benji York)
Date: Thu, 8 Jul 2010 11:55:15 -0400
Subject: [Catalog-sig] Serial/Parallel port libraries
In-Reply-To: <AANLkTimMq7qNR9VFUP0Ol0K0oVKz6_xEPJ1oUjkDBsSd@mail.gmail.com>
References: <AANLkTimMq7qNR9VFUP0Ol0K0oVKz6_xEPJ1oUjkDBsSd@mail.gmail.com>
Message-ID: <AANLkTilFS0tWunryn1CybVFDDWfjz_acA4QPsm6t43Mo@mail.gmail.com>

On Thu, Jul 8, 2010 at 8:29 AM, Fred Freeley <dk.neugierig at gmail.com> wrote:
> I would like to use Python to control serial port based instruments
> directly, and not via a USB->Serial converter.
> Are there any libraries out there that would provide that functionality in
> Windows Vista/7/XP?

Try searching PyPI:
http://pypi.python.org/pypi?%3Aaction=search&term=serial+port&submit=search
-- 
Benji York

From merwok at netwok.org  Wed Jul 14 09:36:19 2010
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Wed, 14 Jul 2010 09:36:19 +0200
Subject: [Catalog-sig] Looking for help: Small PyPI change to fix setup.py
	register
Message-ID: <4C3D68F3.1040402@netwok.org>

Hello catalog people

Over there on distutils-sig [1], someone pointed out that using setup.py
register to create a new account did not work anymore. This was not
discovered before since most users probably already had an account or
used the Web page to create it. Remember that account creation has a
confirmation step that is always done via a Web page.

[1] http://mail.python.org/pipermail/distutils-sig/2010-July/016579.html

The reason that the Web interface still works and the command-line
one does not is that there is now a small usage agreement to be read and
accepted, following advice from Van Lindberg, the PSF lawyer. Greg
Ewing proposed to fix the setup.py register bug by moving the usage
agreement checkbox to the confirmation page. That way, every version of
distutils will work, and every registration (Web or command-line, any
version) will pass through the usage agreement. Van Lindberg approved
the change.

Martin von Löwis asked for a patch. I grepped for strings seen on the
Web pages and URIs. If I had any METAL templates-fu, I'd try this way:

- remove the agreement text and checkbox from templates/register.pt

- add the agreement text and checkbox in a new form (with action
attribute set to /pypi and hidden input elements to get the URI
pypi?:action=user&otk=BLAH&agree=on) after line 2297 in webui/core.py
(or in a template or template block)

- move the test for the checkbox at lines 2355-2358 in webui/core.py
after line 2300 (with the right if blocks to check for 'otk' in
self.form and 'agree' in self.form)
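
A rough illustration of the check described in the last step, not the
actual webui/core.py code: the function name confirm_registration, the
plain dict standing in for self.form, and the returned strings are all
hypothetical. Only the field names 'otk' and 'agree' come from the
description above.

```python
def confirm_registration(form):
    """Hypothetical stand-in for the confirmation step.

    `form` is a plain dict playing the role of self.form.
    """
    if 'otk' not in form:
        # No one-time key: this is not a confirmation request at all.
        return 'missing-otk'
    if 'agree' not in form:
        # Key present but agreement not ticked: show the agreement form.
        return 'show-agreement'
    # Both present: the account can be confirmed.
    return 'confirmed'
```

Because the agreement is only checked at confirmation time, an old
distutils that never sends 'agree' would still get through the initial
registration request.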

I hope this can help someone write the patch.

Regards


From martin at v.loewis.de  Fri Jul 16 09:57:14 2010
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 16 Jul 2010 09:57:14 +0200
Subject: [Catalog-sig] PyPI mirror system is up
Message-ID: <4C4010DA.3040209@v.loewis.de>

The mirror system described in PEP 381 is implemented.
We have three mirrors available for clients to use,
[bcd].pypi.python.org. These mirrors are restricted
(in conformance to PEP 381) to the /simple pages, i.e.
primarily intended for download through automated clients.

This service is considered experimental until some operational
experience is gained, and until the PEP gets marked as accepted.

Regards,
Martin

From fdrake at acm.org  Fri Jul 16 15:00:55 2010
From: fdrake at acm.org (Fred Drake)
Date: Fri, 16 Jul 2010 09:00:55 -0400
Subject: [Catalog-sig] PyPI mirror system is up
In-Reply-To: <4C4010DA.3040209@v.loewis.de>
References: <4C4010DA.3040209@v.loewis.de>
Message-ID: <AANLkTim9hoYE4bi1RCLcFBknqHCisEZVZFKn0NWYoMtO@mail.gmail.com>

On Fri, Jul 16, 2010 at 3:57 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> The mirror system described in PEP 381 is implemented.
> We have three mirror available for clients to use,
> [bcd].pypi.python.org.

Thanks, Martin!

Are the mirrors geographically dispersed?


  -Fred

--
Fred L. Drake, Jr.    <fdrake at gmail.com>
"A storm broke loose in my mind."  --Albert Einstein

From ziade.tarek at gmail.com  Fri Jul 16 15:40:34 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Fri, 16 Jul 2010 15:40:34 +0200
Subject: [Catalog-sig] PyPI mirror system is up
In-Reply-To: <AANLkTim9hoYE4bi1RCLcFBknqHCisEZVZFKn0NWYoMtO@mail.gmail.com>
References: <4C4010DA.3040209@v.loewis.de>
	<AANLkTim9hoYE4bi1RCLcFBknqHCisEZVZFKn0NWYoMtO@mail.gmail.com>
Message-ID: <AANLkTin3zdsiQQ_aOlNm553cUh5uwVru2rqufZaMyNhA@mail.gmail.com>

On Fri, Jul 16, 2010 at 3:00 PM, Fred Drake <fdrake at acm.org> wrote:
> On Fri, Jul 16, 2010 at 3:57 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> The mirror system described in PEP 381 is implemented.
>> We have three mirror available for clients to use,
>> [bcd].pypi.python.org.
>
> Thanks, Martin!
>
> Are the mirrors geographically dispersed?

IIRC there are two or three in Germany; I intend to run one in France.
We would need one in Asia from a trusted member of the community.

The next step will be to add a geoloc feature, but PyPI DNS doesn't
have such a feature, so I was thinking about having some kind of ping
mechanism from the client to sort the mirrors by round trip duration.
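
A minimal sketch of such a client-side sort, assuming that the time to
open a TCP connection on port 80 is a good enough stand-in for a ping.
The function names connect_time and sort_mirrors are made up for this
example and are not part of distribute or distutils2.

```python
import socket
import time


def connect_time(host, port=80, timeout=2.0):
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        sock = socket.create_connection((host, port), timeout)
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return None
    sock.close()
    return time.monotonic() - start


def sort_mirrors(mirrors, probe=connect_time):
    """Return the reachable mirrors, fastest first.

    `probe` maps a host name to a duration in seconds, or None if the
    host could not be reached; unreachable hosts are dropped.
    """
    timed = [(probe(host), host) for host in mirrors]
    reachable = [pair for pair in timed if pair[0] is not None]
    return [host for _duration, host in sorted(reachable)]
```

Passing the probe in as a parameter keeps the sorting logic testable
without touching the network.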

There's a branch for mirroring support in distribute, and a first
version, without the sorting, should be out soon.

Distutils2 already has mirror support in trunk.

Cheers
Tarek



-- 
Tarek Ziadé | http://ziade.org

From ametaireau at gmail.com  Fri Jul 16 15:52:00 2010
From: ametaireau at gmail.com (=?UTF-8?Q?Alexis_M=C3=A9taireau?=)
Date: Fri, 16 Jul 2010 15:52:00 +0200
Subject: [Catalog-sig] PyPI mirror system is up
In-Reply-To: <AANLkTin3zdsiQQ_aOlNm553cUh5uwVru2rqufZaMyNhA@mail.gmail.com>
References: <4C4010DA.3040209@v.loewis.de>
	<AANLkTim9hoYE4bi1RCLcFBknqHCisEZVZFKn0NWYoMtO@mail.gmail.com> 
	<AANLkTin3zdsiQQ_aOlNm553cUh5uwVru2rqufZaMyNhA@mail.gmail.com>
Message-ID: <AANLkTilvaK5x3mHQ3jAxL1mV18KDR7nuotPEALJFH-Zo@mail.gmail.com>

> The next step will be to add a geoloc feature, but PyPI DNS doesn't
> have such feature,
> so I was thinking about having some kind of ping mechanism from the
> client to sort
> the mirrors by round trip duration.
>

Maybe we can have a look at the Debian netselect package [1]. I'm thinking
about a simple tool to select the nearest/quickest server, and update a
configuration somewhere (something like ~/.config/distutils2/mirror.cfg),
then use it by default.
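
A sketch of what storing and reading back the selected mirror could look
like, using the standard configparser module. The section name "mirrors",
the option name "preferred" and both function names are assumptions made
for this example; no such file format has been agreed on.

```python
import configparser
import os


def save_mirror(host, path):
    """Record `host` as the preferred mirror in the config file at `path`."""
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file is silently ignored
    if not parser.has_section('mirrors'):
        parser.add_section('mirrors')
    parser.set('mirrors', 'preferred', host)
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(path, 'w') as handle:
        parser.write(handle)


def load_mirror(path, default='pypi.python.org'):
    """Return the stored mirror, or `default` if nothing was stored."""
    parser = configparser.ConfigParser()
    parser.read(path)
    return parser.get('mirrors', 'preferred', fallback=default)
```

A client would call load_mirror(os.path.expanduser(
'~/.config/distutils2/mirror.cfg')) at startup and fall back to the
main index when no selection has been made yet.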

I'm updating the mirror support in distutils2 to reflect the latest PEP
changes, and will provide it in distutils2.mirrors.py, I guess.

[1] http://packages.debian.org/sid/netselect

Cheers,
Alexis

From martin at v.loewis.de  Fri Jul 16 21:33:28 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 16 Jul 2010 21:33:28 +0200
Subject: [Catalog-sig] PyPI mirror system is up
In-Reply-To: <AANLkTim9hoYE4bi1RCLcFBknqHCisEZVZFKn0NWYoMtO@mail.gmail.com>
References: <4C4010DA.3040209@v.loewis.de>
	<AANLkTim9hoYE4bi1RCLcFBknqHCisEZVZFKn0NWYoMtO@mail.gmail.com>
Message-ID: <4C40B408.1030200@v.loewis.de>

On 16.07.2010 15:00, Fred Drake wrote:
> On Fri, Jul 16, 2010 at 3:57 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> The mirror system described in PEP 381 is implemented.
>> We have three mirror available for clients to use,
>> [bcd].pypi.python.org.
> 
> Thanks, Martin!
> 
> Are the mirrors geographically dispersed?

No, they are not.

Regards,
Martin


From ben+python at benfinney.id.au  Sat Jul 17 09:52:48 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Sat, 17 Jul 2010 17:52:48 +1000
Subject: [Catalog-sig] PyPI mirror system is up
References: <4C4010DA.3040209@v.loewis.de>
Message-ID: <878w5aa6jz.fsf@benfinney.id.au>

"Martin v. Löwis" <martin at v.loewis.de> writes:

> The mirror system described in PEP 381 is implemented.

Great news, thanks for announcing this.

-- 
 \         "In any great organization it is far, far safer to be wrong |
  `\         with the majority than to be right alone." --John Kenneth |
_o__)                                            Galbraith, 1989-07-28 |
Ben Finney


From jacob at jacobian.org  Mon Jul 19 19:34:40 2010
From: jacob at jacobian.org (Jacob Kaplan-Moss)
Date: Mon, 19 Jul 2010 10:34:40 -0700
Subject: [Catalog-sig] Monitoring for PyPI
Message-ID: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>

Hi folks --

So PyPI's down again this morning.

This isn't about that, though: apparently nobody with the appropriate
access knew anything about it until I asked in IRC and someone there
was able to ping Martin and get things up again.

That's some weaksauce right there. PyPI shouldn't rely on
monitoring-via-complaints-in-IRC.

I would be happy to set up a monitoring server to monitor PyPI's
availability and notify the appropriate parties via email, SMS, or any
other mechanism.
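
A minimal sketch of the probe such a monitor could run on a schedule,
assuming that "available" simply means an HTTP 200 from the front page.
The function name check_site and the injectable `opener` parameter are
inventions for this example; the notification side (email, SMS) is
deliberately left out.

```python
import urllib.request


def check_site(url, opener=urllib.request.urlopen, timeout=10):
    """Probe `url` once and return a (ok, detail) tuple.

    `opener` defaults to urllib.request.urlopen and can be replaced
    with a fake in tests.
    """
    try:
        response = opener(url, timeout=timeout)
        status = response.getcode()
    except OSError as exc:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False, str(exc)
    return status == 200, 'HTTP %d' % status
```

A cron job could call check_site('http://pypi.python.org/pypi') every few
minutes and only notify people after several consecutive failures, to
avoid paging on a single dropped request.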

I can have this done by tomorrow afternoon if I can get the necessary
contact information and if the people with the appropriate access are
willing to be paged when things go down.

I'm going to start with the setup right now on the assumption that
this is a valuable service. If those with access are unwilling to be
notified, though, please let me know so I don't waste too much time.

Thanks!

Jacob

From martin at v.loewis.de  Mon Jul 19 19:38:15 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 19 Jul 2010 18:38:15 +0100
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>
Message-ID: <4C448D87.70109@v.loewis.de>

> I'm going to start with the setup right now on the assumption that
> this is a valuable service. If those with access are unwilling to be
> notified, though, please let me know so I don't waste too much time.

Feel free to have this system email me. However, I didn't have internet
connectivity for much of the afternoon, so I wouldn't have been able
to receive the email.

Regards,
Martin


From ziade.tarek at gmail.com  Mon Jul 19 19:42:24 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Mon, 19 Jul 2010 19:42:24 +0200
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <4C448D87.70109@v.loewis.de>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>
	<4C448D87.70109@v.loewis.de>
Message-ID: <AANLkTinrTopwF240v9f164whBtFiHN2sAJTXrx9rbL5A@mail.gmail.com>

On Mon, Jul 19, 2010 at 7:38 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> I'm going to start with the setup right now on the assumption that
>> this is a valuable service. If those with access are unwilling to be
>> notified, though, please let me know so I don't waste too much time.
>
> Feel free to have this system email me. However, I didn't have internet
> connectivity for much of the afternoon, so I wouldn't have been able
> to receive the email.

I'd rather have an email here, at catalog-sig, so that more people can
act on it (like Richard, Jannis, etc.)




-- 
Tarek Ziadé | http://ziade.org

From fdrake at acm.org  Mon Jul 19 19:44:40 2010
From: fdrake at acm.org (Fred Drake)
Date: Mon, 19 Jul 2010 13:44:40 -0400
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>
Message-ID: <AANLkTiljUoBrMgZgdDDGocDJ5isq6k5Cz-8-cSb3M8Ur@mail.gmail.com>

On Mon, Jul 19, 2010 at 1:34 PM, Jacob Kaplan-Moss <jacob at jacobian.org> wrote:
> I'm going to start with the setup right now on the assumption that
> this is a valuable service. If those with access are unwilling to be
> notified, though, please let me know so I don't waste too much time.

This will indeed be valuable.

Did we ever determine whether there are any other responders beyond
Martin?  I'd hate for him to have to deal with every outage.

Ideally, there'd be a list of responders with appropriate access, and
what hours each could be contacted.  If there's not one, I'm happy to
collect the information, but we'll need the responders to speak up so
we'll know about them.


  -Fred

--
Fred L. Drake, Jr.    <fdrake at gmail.com>
"A storm broke loose in my mind."  --Albert Einstein

From martin at v.loewis.de  Mon Jul 19 19:45:34 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 19 Jul 2010 18:45:34 +0100
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <AANLkTinrTopwF240v9f164whBtFiHN2sAJTXrx9rbL5A@mail.gmail.com>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>	<4C448D87.70109@v.loewis.de>
	<AANLkTinrTopwF240v9f164whBtFiHN2sAJTXrx9rbL5A@mail.gmail.com>
Message-ID: <4C448F3E.7020401@v.loewis.de>

>> Feel free to have this system email me. However, I didn't have internet
>> connectivity for much of the afternoon, so I wouldn't have been able
>> to receive the email.
>
> I'd rather have an email here, at catalog-sig. So more people can act upon
> (like, Richard, Janis, etc)

That might work as well. FWIW, I actually did get an email message from 
my own monitoring system, but wasn't able to act on it because I didn't
have internet connectivity.

It might also be useful to involve roto-rooters.

Regards,
Martin

From ziade.tarek at gmail.com  Mon Jul 19 19:57:16 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Mon, 19 Jul 2010 19:57:16 +0200
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <4C448F3E.7020401@v.loewis.de>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>
	<4C448D87.70109@v.loewis.de>
	<AANLkTinrTopwF240v9f164whBtFiHN2sAJTXrx9rbL5A@mail.gmail.com>
	<4C448F3E.7020401@v.loewis.de>
Message-ID: <AANLkTimOECSuTMZvosVeZd1-8vxjfkuX2GG89bg814zQ@mail.gmail.com>

2010/7/19 "Martin v. Löwis" <martin at v.loewis.de>:
>>> Feel free to have this system email me. However, I didn't have internet
>>> connectivity for much of the afternoon, so I wouldn't have been able
>>> to receive the email.
>>
>> I'd rather have an email here, at catalog-sig. So more people can act upon
>> (like, Richard, Janis, etc)
>
> That might work as well. FWIW, I actually did get an email message from my
> own monitoring system, but wasn't able to act on it because I didn't
> have internet connectivity.
>
> It might also be useful to involve roto-rooters.

What are those ? (http://en.wikipedia.org/wiki/Roto-Rooter ?! :) )




-- 
Tarek Ziadé | http://ziade.org

From fdrake at acm.org  Mon Jul 19 20:16:45 2010
From: fdrake at acm.org (Fred Drake)
Date: Mon, 19 Jul 2010 14:16:45 -0400
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <AANLkTimOECSuTMZvosVeZd1-8vxjfkuX2GG89bg814zQ@mail.gmail.com>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com> 
	<4C448D87.70109@v.loewis.de>
	<AANLkTinrTopwF240v9f164whBtFiHN2sAJTXrx9rbL5A@mail.gmail.com> 
	<4C448F3E.7020401@v.loewis.de>
	<AANLkTimOECSuTMZvosVeZd1-8vxjfkuX2GG89bg814zQ@mail.gmail.com>
Message-ID: <AANLkTilimIdI8SoGPt2NojpzKfu24YZ1XhE717tI8r_P@mail.gmail.com>

2010/7/19 Tarek Ziadé <ziade.tarek at gmail.com>:
> What are those ? (http://en.wikipedia.org/wiki/Roto-Rooter ?! :) )

Those folks who have root access on (at least) some of the python.org machines.

There's already an email alias for that group, and they're fairly
diverse geographically.


  -Fred

--
Fred L. Drake, Jr.    <fdrake at gmail.com>
"A storm broke loose in my mind."  --Albert Einstein

From tjreedy at udel.edu  Mon Jul 19 21:26:46 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 19 Jul 2010 15:26:46 -0400
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <4C448D87.70109@v.loewis.de>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>
	<4C448D87.70109@v.loewis.de>
Message-ID: <i228tl$pg6$1@dough.gmane.org>

On 7/19/2010 1:38 PM, "Martin v. Löwis" wrote:
>> I'm going to start with the setup right now on the assumption that
>> this is a valuable service. If those with access are unwilling to be
>> notified, though, please let me know so I don't waste too much time.
>
> Feel free to have this system email me. However, I didn't have internet
> connectivity for much of the afternoon, so I wouldn't have been able
> to receive the email.

Is it possible to put the restart routine into a script that could be
invoked by a mail filter, or does the process require human monitoring?

-- 
Terry Jan Reedy



From ianb at colorstudy.com  Mon Jul 19 21:51:24 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 19 Jul 2010 14:51:24 -0500
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <i228tl$pg6$1@dough.gmane.org>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com> 
	<4C448D87.70109@v.loewis.de> <i228tl$pg6$1@dough.gmane.org>
Message-ID: <AANLkTinU4-v3lFdJWtoNo7r8azr10FJbQlGDWMmePMFS@mail.gmail.com>

Just for the record, what was the resolution for this downtime?

  Ian

From martin at v.loewis.de  Tue Jul 20 00:20:25 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 19 Jul 2010 23:20:25 +0100
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <AANLkTinU4-v3lFdJWtoNo7r8azr10FJbQlGDWMmePMFS@mail.gmail.com>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>
	<4C448D87.70109@v.loewis.de> <i228tl$pg6$1@dough.gmane.org>
	<AANLkTinU4-v3lFdJWtoNo7r8azr10FJbQlGDWMmePMFS@mail.gmail.com>
Message-ID: <4C44CFA9.7000701@v.loewis.de>

On 19.07.10 20:51, Ian Bicking wrote:
> Just for the record, what was the resolution for this downtime?

I restarted Apache.

Regards,
Martin

From martin at v.loewis.de  Tue Jul 20 00:22:23 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 19 Jul 2010 23:22:23 +0100
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <AANLkTiljUoBrMgZgdDDGocDJ5isq6k5Cz-8-cSb3M8Ur@mail.gmail.com>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>
	<AANLkTiljUoBrMgZgdDDGocDJ5isq6k5Cz-8-cSb3M8Ur@mail.gmail.com>
Message-ID: <4C44D01F.9080709@v.loewis.de>

> Did we ever determine whether there are any other responders beyond
> Martin?

Jannis Leidel is now looking into this also. IIUC, we are both at
EuroPython right now, so neither of us had Internet connectivity
when the outage started.

> Ideally, there'd be a list of responders with appropriate access, and
> what hours each could be contacted.  If there's not one, I'm happy to
> collect the information, but we'll need the responders to speak up so
> we'll know about them.

Please go ahead.

Regards,
Martin


From jannis at leidel.info  Tue Jul 20 10:42:52 2010
From: jannis at leidel.info (Jannis Leidel)
Date: Tue, 20 Jul 2010 09:42:52 +0100
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <4C44D01F.9080709@v.loewis.de>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>
	<AANLkTiljUoBrMgZgdDDGocDJ5isq6k5Cz-8-cSb3M8Ur@mail.gmail.com>
	<4C44D01F.9080709@v.loewis.de>
Message-ID: <B3843FDA-A207-4AA6-880F-2B25303B3FDB@leidel.info>

On 19.07.2010 at 23:22, Martin v. Löwis wrote:

>> Did we ever determine whether there are any other responders beyond
>> Martin?
> 
> Jannis Leidel is now looking into this also. IIUC, we are both at
> EuroPython right now, so neither of us had Internet connectivity
> when the outage started.

Indeed. 

>> Ideally, there'd be a list of responders with appropriate access, and
>> what hours each could be contacted.  If there's not one, I'm happy to
>> collect the information, but we'll need the responders to speak up so
>> we'll know about them.
> 
> Please go ahead.

+1

From pnasrat at google.com  Wed Jul 21 14:10:50 2010
From: pnasrat at google.com (Paul Nasrat)
Date: Wed, 21 Jul 2010 13:10:50 +0100
Subject: [Catalog-sig] Mirror list detection/construction - PEP 381
Message-ID: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>

I was looking through PEP 381, which gives the following about mirror
list construction for clients:

Clients that are browsing PyPI should be able to use alternative
mirrors, by getting the list of the mirrors using `last.pypi.python.org`.

Code example::

    >>> import socket
    >>> socket.gethostbyname_ex('last.pypi.python.org')[0]
    'h.pypi.python.org'

My reading of this is that the intent is for a client to be able to
resolve this to find the last mirror, e.g. h, zz, etc. Obviously smart
clients can then use this information to figure out the closest/fastest
mirror etc.

As documented, this is not a robust way to resolve this; on OS X I get
the following:

>>> import socket
>>> socket.gethostbyname_ex('last.pypi.python.org')
('pypi.websushi.org',
 ['last.pypi.python.org', 'd.pypi.python.org'],
 ['88.198.109.79'])

Which resolves to the actual host not the intermediate alias, but the
alias is preserved in the aliaslist. Whilst discussing this on
#distutils we discovered that some resolvers behave quite differently:

Python 2.6.5 on Windows 2008R2
>>> socket.gethostbyname_ex('last.pypi.python.org')
('pypi.websushi.org', [], ['88.198.109.79'])

Given the fragility of this, it seems that we might want to consider an
alternative mirror list discovery mechanism. Talking with Alexis, I found
he'd already implemented mirror list construction for distutils2:

http://bitbucket.org/ametaireau/distutils2/src/tip/src/distutils2/index/mirrors.py

There should be a reliable way to construct the list for clients.
Mechanisms could be a static page to scrape, a known url with
redirect, DNS SRV records or something else.
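
To make the problem concrete, here is a sketch of a slightly more
tolerant client, assuming mirror names are runs of lowercase letters
(a, b, ..., z, aa, ...) under .pypi.python.org and that the list starts
at "a". The function names are made up for this example. It looks at the
canonical name and at every alias, so it copes with the OS X result
above, but it still cannot help when the resolver drops the aliases
entirely, as in the Windows result.

```python
import itertools
import string

SUFFIX = '.pypi.python.org'


def find_last_mirror(resolved):
    """Pick the 'X.pypi.python.org' name out of a gethostbyname_ex() result.

    `resolved` is the (hostname, aliaslist, ipaddrlist) tuple. Returns
    None when no letter-named mirror appears anywhere in it.
    """
    hostname, aliases, _addresses = resolved
    for name in [hostname] + list(aliases):
        if not name.endswith(SUFFIX):
            continue
        prefix = name[:-len(SUFFIX)]
        if prefix.isalpha() and prefix.islower() and prefix != 'last':
            return name
    return None


def mirror_names(last):
    """Yield every mirror name from 'a' up to and including `last`."""
    last_prefix = last[:-len(SUFFIX)]
    # Never generate names longer than the last one, so this terminates.
    for length in range(1, len(last_prefix) + 1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            prefix = ''.join(letters)
            yield prefix + SUFFIX
            if prefix == last_prefix:
                return
```

Usage would be find_last_mirror(socket.gethostbyname_ex(
'last.pypi.python.org')), followed by list(mirror_names(...)) when the
result is not None.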

Thoughts?

Paul

From jcea at jcea.es  Wed Jul 21 14:57:59 2010
From: jcea at jcea.es (Jesus Cea)
Date: Wed, 21 Jul 2010 14:57:59 +0200
Subject: [Catalog-sig] Mirror list detection/construction - PEP 381
In-Reply-To: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>
References: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>
Message-ID: <4C46EED7.5030908@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 21/07/10 14:10, Paul Nasrat wrote:
> There should be a reliable way to construct the list for clients.
> Mechanisms could be a static page to scrape, a known url with
> redirect, DNS SRV records or something else.

I am a BIG fan of DNS SRV records (to the point of promoting them as a
replacement for MX records, or "www" prefixes in URLs), but most resolvers
will not provide them to regular applications (you have no easy way to
query for an SRV record using standard APIs). Python applications, for
instance, would need to bundle the "dnspython" library, or something similar.

I would absolutely LOVE to be proved wrong.

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTEbu15lgi5GaxT1NAQIGpQP9HJNn2mnSEV8yU3oFHK31cLI4UGefotyo
zGItj7L42OtoFjnOt2ETSmJce8ytJ2OSaP5Fx5seTz/dd6agL/OA+6zBwdso5hHW
9pRno8Z7wdujYYpe1gV3jq5cmM5uij8KRLWyfKdG79sNV+lKQ3nwtPMB+Gnh/hbc
e0ZwXI4fwOo=
=BvwQ
-----END PGP SIGNATURE-----

From pje at telecommunity.com  Wed Jul 21 21:20:17 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Wed, 21 Jul 2010 15:20:17 -0400
Subject: [Catalog-sig] Mirror list detection/construction - PEP 381
In-Reply-To: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>
References: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>
Message-ID: <20100721192023.B5E373A409B@sparrow.telecommunity.com>

At 01:10 PM 7/21/2010 +0100, Paul Nasrat wrote:
>Given the fragility of this it seems that we might want to consider
>alternative mirrorlist discovery mechanism.

As a fallback, you can of course probe addresses on a binary search, 
or even just select a random mirror in the first place.  (If you 
overshoot the list, you just decrease the max on your binary lookup.)

Non-random selection is tougher to implement, since you'd need to 
keep some kind of history to make it work effectively.  Determining 
the length of the list is a trivial problem by comparison. 
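
A sketch of the binary probe, assuming mirrors are allocated
contiguously, so that if mirror number i resolves then every lower number
does too. The function name count_mirrors, the `exists` callback and the
default upper bound (all one- and two-letter names) are choices made for
this example only; in a real client `exists` would wrap a DNS lookup.

```python
def count_mirrors(exists, upper=26 + 26 * 26):
    """Return how many mirrors exist.

    `exists(i)` must return True when the mirror with zero-based index
    `i` resolves. Assumes indexes 0..n-1 all exist and nothing at or
    beyond n does.
    """
    low, high = 0, upper          # invariant: the count lies in [low, high]
    while low < high:
        mid = (low + high) // 2
        if exists(mid):
            low = mid + 1         # mirror `mid` is there, so count > mid
        else:
            high = mid            # overshoot: decrease the max
    return low
```

With the default bound this needs at most ten lookups, and a random
mirror can then be drawn from range(count).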


From pnasrat at google.com  Thu Jul 22 01:32:02 2010
From: pnasrat at google.com (Paul Nasrat)
Date: Thu, 22 Jul 2010 00:32:02 +0100
Subject: [Catalog-sig] Mirror list detection/construction - PEP 381
In-Reply-To: <20100721192023.B5E373A409B@sparrow.telecommunity.com>
References: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>
	<20100721192023.B5E373A409B@sparrow.telecommunity.com>
Message-ID: <AANLkTikc7YGu+hN9AJxe004hvwS0KcAnSUjAbxHa1Sdn@mail.gmail.com>

On Wed, Jul 21, 2010 at 8:20 PM, P.J. Eby <pje at telecommunity.com> wrote:
> At 01:10 PM 7/21/2010 +0100, Paul Nasrat wrote:
>>
>> Given the fragility of this it seems that we might want to consider
>> alternative mirrorlist discovery mechanism.

> Non-random selection is tougher to implement, since you'd need to keep some
> kind of history to make it work effectively.  Determining the length of the
> list is a trivial problem by comparison.

Sure, it's not a hard problem in terms of computer science, but having
a well-defined way to do this for mirroring, dependency resolving and
other clients seems like a reasonable request. But that selection is
going to be the responsibility of the clients; some may take the hit
to maintain a history and periodically update or generate a config
(cf. the fastestmirror plugin for yum, or netselect-apt).

From just picking up the PEP, reading it and using the information
therein to write a client implementation, I think that the current
documented client code snippet does not do what the description
intends. It could be I'm misreading this, so please correct me if it is
not the intent that clients should be able to generate a list of mirrors
to operate on via last.pypi.python.org.

Is there any technical reason you'd want pypi clients to binary search
DNS to find the mirror end rather than a more directed lookup?

Paul

From jcea at jcea.es  Thu Jul 22 02:52:51 2010
From: jcea at jcea.es (Jesus Cea)
Date: Thu, 22 Jul 2010 02:52:51 +0200
Subject: [Catalog-sig] Mirror list detection/construction - PEP 381
In-Reply-To: <20100721192023.B5E373A409B@sparrow.telecommunity.com>
References: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>
	<20100721192023.B5E373A409B@sparrow.telecommunity.com>
Message-ID: <4C479663.5000304@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 21/07/10 21:20, P.J. Eby wrote:
> At 01:10 PM 7/21/2010 +0100, Paul Nasrat wrote:
>> Given the fragility of this it seems that we might want to consider
>> alternative mirrorlist discovery mechanism.
> 
> As a fallback, you can of course probe addresses on a binary search, or
> even just select a random mirror in the first place.  (If you overshoot
> the list, you just decrease the max on your binary lookup.)
> 
> Non-random selection is tougher to implement, since you'd need to keep
> some kind of history to make it work effectively.  Determining the
> length of the list is a trivial problem by comparison.

Well, a random shuffle is a standard operation in the "random" module :-).

What I usually do is to pick a random server and my previous selection
(the first time, you choose a second random server). Then do some
operations on BOTH and choose the faster. Complete the operation on it
and keep it for the next time.

This way you always go the fast route, but randomly and gently probe
other nodes, trying to find an even faster one.
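
In code, the idea is roughly the following. This is only a sketch: the
name pick_mirror is made up, and `probe` is assumed to return a duration
in seconds (float('inf') for an unreachable host), with lower meaning
faster.

```python
import random


def pick_mirror(mirrors, previous, probe, rng=random):
    """Return the faster of `previous` and one randomly chosen challenger.

    `mirrors` is a non-empty list of host names; `previous` is the
    mirror kept from the last run, or None on the first run.
    """
    if previous not in mirrors:
        previous = rng.choice(mirrors)      # first run: no history yet
    others = [host for host in mirrors if host != previous]
    if not others:
        return previous                     # only one mirror to choose from
    challenger = rng.choice(others)
    # On a tie, min() keeps the first argument, i.e. the previous choice.
    return min((previous, challenger), key=probe)
```

The returned value is what gets stored as `previous` for the next run, so
only two servers are ever probed per invocation.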

Since using pypi is critical but not very frequent (nobody is going to
install a hundred modules daily), being smart is, maybe, overkill. It could
be enough to just pick 5 nodes at random, check them simultaneously
(threads!), choose one and complete the download from there. If the
download fails, repeat from the very beginning a few times.

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTEeWY5lgi5GaxT1NAQI4/gP+Ibqmsqel2tiEN6zdmWWZIKJsfoKS+u8x
B9XEHvrnF54YFU+GJOqcEmRRsHndy2DKMcWpH0t1wzMSIZlGbdOqtqG47b+KJfnD
8Qun5f5oWvyQXPoPSUqngQHQx7mPOQhTTUP75oeduTjMm6NOBzsbIDhKTDfiSP9g
RzYQo+QK3u8=
=ENnz
-----END PGP SIGNATURE-----

From pje at telecommunity.com  Thu Jul 22 03:55:49 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Wed, 21 Jul 2010 21:55:49 -0400
Subject: [Catalog-sig] Mirror list detection/construction - PEP 381
In-Reply-To: <4C479663.5000304@jcea.es>
References: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>
	<20100721192023.B5E373A409B@sparrow.telecommunity.com>
	<4C479663.5000304@jcea.es>
Message-ID: <20100722015545.19E1C3A4119@sparrow.telecommunity.com>

At 02:52 AM 7/22/2010 +0200, Jesus Cea wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>On 21/07/10 21:20, P.J. Eby wrote:
> > At 01:10 PM 7/21/2010 +0100, Paul Nasrat wrote:
> >> Given the fragility of this it seems that we might want to consider
> >> alternative mirrorlist discovery mechanism.
> >
> > As a fallback, you can of course probe addresses on a binary search, or
> > even just select a random mirror in the first place.  (If you overshoot
> > the list, you just decrease the max on your binary lookup.)
> >
> > Non-random selection is tougher to implement, since you'd need to keep
> > some kind of history to make it work effectively.  Determining the
> > length of the list is a trivial problem by comparison.
>
>Well a random shuffle is a standard operation in "random" module :-).
>
>What I usually do is to pick a random server and my previous selection
>(the first time, you choose a second random server). Then do some
>operations in BOTH and choose the faster. Complete the operation on it
>and keep it for the next time.
>
>This way you always go the fast route, but randomly and gently probe
>other nodes trying to find even other faster.

Not a bad idea.  My main sticking point for adding this to 
easy_install is that it doesn't currently maintain any state like 
this, and there's no obvious place to put it.  Silently rewriting 
config files would be evil, and given that distutils has three layers 
of config files, it's never really clear which one you'd want to 
write to anyway.  Most likely, I'll need to just use "try the default 
or specified one first, then fall back to randomly-selected mirrors."
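
A minimal sketch of that fallback order in Python, assuming nothing about
easy_install's internals: `candidate_order`, `first_working`, and the `fetch`
callable are hypothetical names used only for illustration, and no state is
written anywhere.

```python
import random


def candidate_order(mirrors, preferred=None, rng=random):
    """Return the hosts to try: the preferred index first, the rest shuffled."""
    rest = [m for m in mirrors if m != preferred]
    rng.shuffle(rest)
    return ([preferred] if preferred else []) + rest


def first_working(mirrors, fetch, preferred=None, rng=random):
    """Try each candidate in turn and return (host, result) for the first success.

    `fetch` is a hypothetical callable(host) that raises OSError on failure.
    """
    for host in candidate_order(mirrors, preferred, rng):
        try:
            return host, fetch(host)
        except OSError:
            continue  # this host failed; move on to the next candidate
    raise OSError("no index or mirror responded")
```

Because nothing is persisted, every run starts again from the default or
specified index, which sidesteps the question of which config file to write.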


From jcea at jcea.es  Thu Jul 22 05:10:24 2010
From: jcea at jcea.es (Jesus Cea)
Date: Thu, 22 Jul 2010 05:10:24 +0200
Subject: [Catalog-sig] Mirror list detection/construction - PEP 381
In-Reply-To: <20100722015545.19E1C3A4119@sparrow.telecommunity.com>
References: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>	<20100721192023.B5E373A409B@sparrow.telecommunity.com>	<4C479663.5000304@jcea.es>
	<20100722015545.19E1C3A4119@sparrow.telecommunity.com>
Message-ID: <4C47B6A0.2060501@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 22/07/10 03:55, P.J. Eby wrote:
> Not a bad idea.  My main sticking point for adding this to easy_install
> is that it doesn't currently maintain any state like this, and there's
> no obvious place to put it.  Silently rewriting config files would be
> evil, and given that distutils has three layers of config files, it's
> never really clear which one you'd want to write to anyway.  Most
> likely, I'll need to just use "try the default or specified one first,
> then fall back to randomly-selected mirrors."

You can try them all in parallel (threads!), choose the fastest, and send
the request to it. If that fails, try again, discarding that node. If you
have discarded all servers, start over a couple of times. If that still
fails, report the failure to the user and give up.

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTEe2oJlgi5GaxT1NAQJX/gQAkXPc71I317wpv2VPyrh9sErHWWG27vXb
owuAvwaQkZV2WvQBILmyBP9Avu6W2ZPwej+R+CGaK+unnJhUaczinvHozqTNwHUM
fTd4Fc6GNJORsQPxupbvMfaVJykbnM6LdoXTdW2y/nu/ck9YxIyalQ2Q00WAoTL0
uBsd0Mc08VQ=
=1Yzm
-----END PGP SIGNATURE-----

From pje at telecommunity.com  Thu Jul 22 05:39:54 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Wed, 21 Jul 2010 23:39:54 -0400
Subject: [Catalog-sig] Mirror list detection/construction - PEP 381
In-Reply-To: <4C47B6A0.2060501@jcea.es>
References: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>
	<20100721192023.B5E373A409B@sparrow.telecommunity.com>
	<4C479663.5000304@jcea.es>
	<20100722015545.19E1C3A4119@sparrow.telecommunity.com>
	<4C47B6A0.2060501@jcea.es>
Message-ID: <20100722033958.214343A409B@sparrow.telecommunity.com>

At 05:10 AM 7/22/2010 +0200, Jesus Cea wrote:
>On 22/07/10 03:55, P.J. Eby wrote:
> > Not a bad idea.  My main sticking point for adding this to easy_install
> > is that it doesn't currently maintain any state like this, and there's
> > no obvious place to put it.  Silently rewriting config files would be
> > evil, and given that distutils has three layers of config files, it's
> > never really clear which one you'd want to write to anyway.  Most
> > likely, I'll need to just use "try the default or specified one first,
> > then fall back to randomly-selected mirrors."
>
>You can try them all in parallel (threads!), choose the fastest, and send
>the request to it. If that fails, try again, discarding that node. If you
>have discarded all servers, start over a couple of times. If that still
>fails, report the failure to the user and give up.

Are you actually suggesting I add threads to *easy_install*?

I'm not sure I could handle that kind of excitement.  ;-)


From martin at v.loewis.de  Thu Jul 22 09:59:06 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 22 Jul 2010 08:59:06 +0100
Subject: [Catalog-sig] Mirror list detection/construction - PEP 381
In-Reply-To: <20100721192023.B5E373A409B@sparrow.telecommunity.com>
References: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>
	<20100721192023.B5E373A409B@sparrow.telecommunity.com>
Message-ID: <4C47FA4A.7010902@v.loewis.de>

>> Given the fragility of this it seems that we might want to consider
>> alternative mirrorlist discovery mechanism.
>
> As a fallback, you can of course probe addresses on a binary search, or
> even just select a random mirror in the first place. (If you overshoot
> the list, you just decrease the max on your binary lookup.)

I think you misunderstood the question. The issue is not at all what 
mirrors to use once you know what they are - the question is how to find
out what mirrors exist in the first place.

When you know what they are, I do support the idea of trying them all
simultaneously, though not in parallel. With sockets, you can have many
open without having to use threads.

I'll post some code that does that shortly.
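
For illustration only (this is not the code referred to above), a sketch of
the non-threaded approach using the standard `selectors` and `socket`
modules: start a non-blocking connect to every address and return whichever
one completes without error first. `fastest_connect` is a hypothetical name,
IPv4 is assumed, and any host name passed in would still be resolved
synchronously by `connect_ex`.

```python
import errno
import selectors
import socket
import time


def fastest_connect(addresses, timeout=10.0):
    """Return the first (host, port) whose TCP connect succeeds, else None."""
    sel = selectors.DefaultSelector()
    try:
        for addr in addresses:
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.setblocking(False)
            try:
                err = sock.connect_ex(addr)
            except OSError:  # e.g. the host name could not be resolved
                sock.close()
                continue
            if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
                sock.close()
                continue
            # A pending connect is finished once the socket becomes writable.
            sel.register(sock, selectors.EVENT_WRITE, addr)
        deadline = time.monotonic() + timeout
        while sel.get_map():
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return None
            for key, _events in sel.select(remaining):
                sock = key.fileobj
                err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
                sel.unregister(sock)
                sock.close()
                if err == 0:
                    return key.data
        return None
    finally:
        # Close whatever is still pending, then the selector itself.
        for key in list(sel.get_map().values()):
            sel.unregister(key.fileobj)
            key.fileobj.close()
        sel.close()
```

All connects are in flight at once, but everything runs in a single thread,
so no locking or queues are needed.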

Regards,
Martin

From ziade.tarek at gmail.com  Thu Jul 22 12:01:31 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Thu, 22 Jul 2010 12:01:31 +0200
Subject: [Catalog-sig] Mirror list detection/construction - PEP 381
In-Reply-To: <4C479663.5000304@jcea.es>
References: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>
	<20100721192023.B5E373A409B@sparrow.telecommunity.com>
	<4C479663.5000304@jcea.es>
Message-ID: <AANLkTinOseoA4dBb-_kkQU3Nzoroajiv4aq2-9jF4DLj@mail.gmail.com>

On Thu, Jul 22, 2010 at 2:52 AM, Jesus Cea <jcea at jcea.es> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 21/07/10 21:20, P.J. Eby wrote:
>> At 01:10 PM 7/21/2010 +0100, Paul Nasrat wrote:
>>> Given the fragility of this it seems that we might want to consider
>>> alternative mirrorlist discovery mechanism.
>>
>> As a fallback, you can of course probe addresses on a binary search, or
>> even just select a random mirror in the first place.  (If you overshoot
>> the list, you just decrease the max on your binary lookup.)
>>
>> Non-random selection is tougher to implement, since you'd need to keep
>> some kind of history to make it work effectively.  Determining the
>> length of the list is a trivial problem by comparison.
>
> Well a random shuffle is a standard operation in "random" module :-).
>
> What I usually do is to pick a random server and my previous selection
> (the first time, you choose a second random server). Then do some
> operations in BOTH and choose the faster. Complete the operation on it
> and keep it for the next time.
>
> This way you always go the fast route, but randomly and gently probe
> other nodes trying to find even other faster.

There's one more parameter to take into account I guess: the "freshness"
of the mirror, e.g. when it was last synced.
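
A sketch of the "previous choice versus one random challenger" rule with a
freshness check added, assuming the caller already has a last-sync
timestamp per mirror and some way to time a small request. `pick_mirror`,
`measure`, and `last_synced` are hypothetical names, not part of any
existing tool.

```python
import random
import time


def pick_mirror(mirrors, previous, measure, last_synced, max_age=86400.0,
                now=None, rng=random):
    """Keep the faster of the previous choice and one random challenger.

    mirrors:     list of mirror host names
    previous:    host used last time, or None on the first run
    measure:     callable(host) -> seconds taken by a small test request
    last_synced: dict mapping host -> UNIX time of its last sync
    max_age:     mirrors staler than this many seconds are ignored
    """
    if now is None:
        now = time.time()
    fresh = [m for m in mirrors if now - last_synced[m] <= max_age]
    if not fresh:
        return None
    challenger = rng.choice(fresh)
    if previous not in fresh:
        # First run, or the old choice went stale: compare two random picks.
        previous = rng.choice(fresh)
    return min((previous, challenger), key=measure)
```

The caller would still have to remember the returned host between runs,
which is the state-keeping problem raised earlier in the thread.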

From jcea at jcea.es  Thu Jul 22 18:13:41 2010
From: jcea at jcea.es (Jesus Cea)
Date: Thu, 22 Jul 2010 18:13:41 +0200
Subject: [Catalog-sig] Mirror list detection/construction - PEP 381
In-Reply-To: <20100722033958.214343A409B@sparrow.telecommunity.com>
References: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>
	<20100721192023.B5E373A409B@sparrow.telecommunity.com>
	<4C479663.5000304@jcea.es>
	<20100722015545.19E1C3A4119@sparrow.telecommunity.com>
	<4C47B6A0.2060501@jcea.es>
	<20100722033958.214343A409B@sparrow.telecommunity.com>
Message-ID: <4C486E35.1080308@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 22/07/10 05:39, P.J. Eby wrote:
> Are you actually suggesting I add threads to *easy_install*?
> 
> I'm not sure I could handle that kind of excitement.  ;-)

Well, all threads would execute the same 5 lines of code, only
connecting with a 10-second timeout and putting in a queue the time the
server took to reply, or the timeout notification. The queue is read
by the main thread, which is only interested in the first entry :).

Pretty unexciting O:-)
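
Roughly what that could look like, as a sketch rather than a patch for
easy_install: `probe` and `fastest_host` are hypothetical names, and port 80
and the 10-second timeout are just the values mentioned above.

```python
import queue
import socket
import threading
import time


def probe(host, port, results, timeout=10.0):
    """Time one TCP connect and put (elapsed seconds or None, host) on the queue."""
    start = time.monotonic()
    try:
        socket.create_connection((host, port), timeout=timeout).close()
        results.put((time.monotonic() - start, host))
    except OSError:  # refused, unreachable, or timed out
        results.put((None, host))


def fastest_host(hosts, port=80, timeout=10.0, probe=probe):
    """Probe every host in its own thread; return the first one that answers."""
    hosts = list(hosts)
    results = queue.Queue()
    for host in hosts:
        threading.Thread(target=probe, args=(host, port, results, timeout),
                         daemon=True).start()
    for _ in hosts:
        elapsed, host = results.get()
        if elapsed is not None:
            return host  # first successful entry on the queue wins
    return None  # every probe reported a failure
```

The main thread never touches a socket; it only reads the queue, so the
threads share no other state.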

- -- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea at jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea at jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTEhuNZlgi5GaxT1NAQKW4AQAh77EsoE1yvB4SIo+V1W3Nbpo22/H7Pqp
pZj0OgJJg5IIDaTM9su5Q0V/XOcCK97xOoaCnD7W/7nl5Boke7AJLXguzu0VTBIu
w+qcmZuqWyraMcoj5cwUtWhXygLzPaanDzwtSAsRCmFbjpL5hsgKmc5HaAe6QVGG
b0YxPWuSrLU=
=WqXD
-----END PGP SIGNATURE-----

From martin at v.loewis.de  Thu Jul 22 14:53:08 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 22 Jul 2010 13:53:08 +0100
Subject: [Catalog-sig] Mirror list detection/construction - PEP 381
In-Reply-To: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>
References: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>
Message-ID: <4C483F34.5070503@v.loewis.de>

> Thoughts?

I've been thinking that *.pypi.python.org should always
yield A records, not CNAMEs.

It may be that this becomes difficult with Google appengine,
though.

Regards,
Martin

From jacob at jacobian.org  Thu Jul 22 18:57:32 2010
From: jacob at jacobian.org (Jacob Kaplan-Moss)
Date: Thu, 22 Jul 2010 09:57:32 -0700
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <AANLkTinrTopwF240v9f164whBtFiHN2sAJTXrx9rbL5A@mail.gmail.com>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com> 
	<4C448D87.70109@v.loewis.de>
	<AANLkTinrTopwF240v9f164whBtFiHN2sAJTXrx9rbL5A@mail.gmail.com>
Message-ID: <AANLkTil1wSrc6WJKOJgSwZzNaa-OYmNE-wt5u9a0e0Ov@mail.gmail.com>

2010/7/19 Tarek Ziadé <ziade.tarek at gmail.com>:
> I'd rather have an email here, at catalog-sig. So more people can act upon
> (like, Richard, Janis, etc)

I can do that, though it'll slightly increase the noise on the list.
Are we okay with that?

Jacob

From ziade.tarek at gmail.com  Thu Jul 22 19:03:37 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Thu, 22 Jul 2010 19:03:37 +0200
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <AANLkTil1wSrc6WJKOJgSwZzNaa-OYmNE-wt5u9a0e0Ov@mail.gmail.com>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>
	<4C448D87.70109@v.loewis.de>
	<AANLkTinrTopwF240v9f164whBtFiHN2sAJTXrx9rbL5A@mail.gmail.com>
	<AANLkTil1wSrc6WJKOJgSwZzNaa-OYmNE-wt5u9a0e0Ov@mail.gmail.com>
Message-ID: <AANLkTimOv6QT89wjt3x36M76ktjUKAFW2tR1jKkZqUw0@mail.gmail.com>

On Thu, Jul 22, 2010 at 6:57 PM, Jacob Kaplan-Moss <jacob at jacobian.org> wrote:
> 2010/7/19 Tarek Ziadé <ziade.tarek at gmail.com>:
>> I'd rather have an email here, at catalog-sig. So more people can act upon
>> (like, Richard, Janis, etc)
>
> I can do that, though it'll increase slightly the noise on the list.
> Are we okay with that?

As long as we just get emails when something is wrong, I think it's a
good kind of noise.

Regards,
Tarek

>
> Jacob
>



-- 
Tarek Ziadé | http://ziade.org

From jacob at jacobian.org  Thu Jul 22 20:02:46 2010
From: jacob at jacobian.org (Jacob Kaplan-Moss)
Date: Thu, 22 Jul 2010 11:02:46 -0700
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>
Message-ID: <AANLkTinYQbmBx3Q7DPKm_cEki4RHbx0-3bF9OOiI_PUO@mail.gmail.com>

Hi folks --

An update:

I've set up a monitoring service. You can see a status dashboard at
http://monitor.jacobian.org/ -- use pypi/pypi for access. DNS is still
propagating, so it may take a bit for that name to work for you.

I'm watching the master pypi as well as all of the mirrors I know
about. If there are more mirrors, tell me and I'll add them.

Notifications right now go out to me and to catalog-sig. They'll be
coming from monitor at jacobian.org, so a moderator probably needs to
whitelist that address to post to this list. I've got things set up
to not be noisy unless there's a bona fide failure.

If anyone else wants to be notified about one or more mirrors, send me
an email off-list with the email you'd like to be notified.

If anyone's interested, I'm using monit (http://mmonit.com/monit), and
you can find a (lightly-elided) config at
http://gist.github.com/486338.

Questions? Comments? Concerns?

Jacob

From pje at telecommunity.com  Thu Jul 22 20:20:40 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Thu, 22 Jul 2010 14:20:40 -0400
Subject: [Catalog-sig] Mirror list detection/construction - PEP 381
Message-ID: <20100722182029.D5FC43A40DF@sparrow.telecommunity.com>

At 08:59 AM 7/22/2010 +0100, Martin v. Löwis wrote:
>>>Given the fragility of this it seems that we might want to consider
>>>alternative mirrorlist discovery mechanism.
>>
>>As a fallback, you can of course probe addresses on a binary search, or
>>even just select a random mirror in the first place. (If you overshoot
>>the list, you just decrease the max on your binary lookup.)
>
>I think you misunderstood the question. The issue is not at all what 
>mirrors to use once you know what they are - the question is how to find
>out what mirrors exist in the first place.

Yes -- which can be done by probing.  If, say, 'x.pypi.python.org' 
doesn't exist, that gives you an upper bound on how many mirrors can exist.

If you are only going to choose a random mirror anyway, then simply 
choosing one at random to start (and adjusting your limit down if you 
overshoot) will accomplish both at the same time.


>When you know what they are, I do support the idea of trying them all
>simultaneously, though not in parallel. With sockets, you can have many
>open without having to use threads.

Interesting.  I wonder how easy that will be to do in a 
cross-platform manner; I seem to recall that Twisted had some rather 
convoluted code to set socket options correctly for this sort of 
thing on different platforms.

Of course, if you are doing this, then you could also simply begin at 
'a' and add hosts to your list until you either get a failed lookup, 
or you get an IP that matches that of 'last.pypi.python.org'.
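
A sketch of that walk, with the resolver passed in so it can be exercised
without DNS. `mirror_labels` and `discover_mirrors` are hypothetical names,
and the `limit` cap is an added assumption to keep the loop bounded.

```python
import itertools
import socket
import string


def mirror_labels():
    """Yield 'a', 'b', ..., 'z', 'aa', 'ab', ... without end."""
    for length in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            yield "".join(letters)


def discover_mirrors(resolve=socket.gethostbyname, domain="pypi.python.org",
                     limit=100):
    """Walk the labels in order and collect the hosts that resolve."""
    try:
        last_ip = resolve("last." + domain)
    except OSError:
        last_ip = None  # no end marker: rely on the first failed lookup
    found = []
    for label in itertools.islice(mirror_labels(), limit):
        host = label + "." + domain
        try:
            ip = resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            break
        found.append(host)
        if ip == last_ip:
            break  # reached the host that 'last' points at
    return found
```

Comparing addresses assumes no earlier mirror shares an IP with the last
one; if two mirrors sat behind the same address, the walk would stop early.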


>I'll post some code that does that shortly.
>
>Regards,
>Martin


From ziade.tarek at gmail.com  Thu Jul 22 20:23:07 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Thu, 22 Jul 2010 20:23:07 +0200
Subject: [Catalog-sig] Monitoring for PyPI
In-Reply-To: <AANLkTinYQbmBx3Q7DPKm_cEki4RHbx0-3bF9OOiI_PUO@mail.gmail.com>
References: <AANLkTinfSfSlgSpMqhzgLnZ8mA4AJ9w_wfrt1s5JVndJ@mail.gmail.com>
	<AANLkTinYQbmBx3Q7DPKm_cEki4RHbx0-3bF9OOiI_PUO@mail.gmail.com>
Message-ID: <AANLkTinvDXylGIqyRarRLnq0epu_J5UR_MHxzlnNiLPC@mail.gmail.com>

On Thu, Jul 22, 2010 at 8:02 PM, Jacob Kaplan-Moss <jacob at jacobian.org> wrote:
[..]
>
> Questions? Comments? Concerns?

Thanks !

>
> Jacob
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>



-- 
Tarek Ziadé | http://ziade.org

From armin.ronacher at active-4.com  Mon Jul 26 00:41:06 2010
From: armin.ronacher at active-4.com (Armin Ronacher)
Date: Sun, 25 Jul 2010 22:41:06 +0000 (UTC)
Subject: [Catalog-sig] Proposal to Reverse Ordering of Scraped Links in PyPI
Message-ID: <loom.20100726T003854-258@post.gmane.org>

Hi,

I guess we can all agree on the fact that setuptools/distribute's way to find
packages and package versions is an interesting but unreliable hack.  However
the problem here is that right now many people depend on setuptools and there is
a problem in the combination of setuptools and PyPI, and the best place to fix
this is PyPI.

First the problem: setuptools knows the concept of encoding special versions of
packages into URLs in package descriptions.  Many people use that to refer to
development versions.  However if you do that, and the URL changes in a later
version, on the simple index all links will still be present.

In combination with setuptools/distribute's behaviour of using the last match,
this means that you are unable to change the link unless you delete the old
description as well.

As a fix I would recommend just listing the latest links on the simple page or
to reverse the order so that easy_install picks up the right one.
 
Here is an example of such a problematic PyPI item:

  http://pypi.python.org/simple/Flask-Babel/

Obviously, the link to mitsuhiko/flask-sqlalchemy is correct and
USERNAME/REPOSITORY is wrong :)

Is that something that could be changed in PyPI or would that go into a new
version of setuptools/distribute?  Right now it seems like the only solution is
to either delete or edit old entries to fix the links.


Regards,
Armin



From fuzzyman at voidspace.org.uk  Mon Jul 26 00:53:46 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sun, 25 Jul 2010 23:53:46 +0100
Subject: [Catalog-sig] Proposal to Reverse Ordering of Scraped Links in
	PyPI
In-Reply-To: <loom.20100726T003854-258@post.gmane.org>
References: <loom.20100726T003854-258@post.gmane.org>
Message-ID: <AANLkTikAwxG0KU_CvHh32HXS9+tN2Tit3Jb-WqOsRA+Z@mail.gmail.com>

On 25 July 2010 23:41, Armin Ronacher <armin.ronacher at active-4.com> wrote:

> Hi,
>
> I guess we can all agree on the fact that setuptools/distribute's way to
> find
> packages and package versions is an interesting but unreliable hack.
>  However
> the problem here is that right now many people depend on setuptools and
> there is
> a problem in the combination of setuptools and PyPI and the best way to fix
> this
> is PyPI.
>
>

Heh pip, distribute and setuptools *really* shouldn't be scraping pypi - at
least except as a last resort. pypi has an xml-rpc (and a fledgling json
api) which should be used in preference to scraping. This *already* causes
problems for pypi maintenance.

Michael Foord



> First the problem: setuptools knows the concept of encoding special
> versions of
> packages into URLs in package descriptions.  Many people use that to refer
> to
> development versions.  However if you do that, and the URL changes in a
> later
> version, on the simple index all links will still be present.
>
> In combination with setuptools/distribute's behaviour of using the last
> match,
> this means that you are unable to change the link unless you delete the old
> description as well.
>
> As a fix I would recommend just listing the latest links on the simple page
> or
> to reverse the order so that easy_install picks up the right one.
>
> Here is an example of such a problematic PyPI item:
>
>  http://pypi.python.org/simple/Flask-Babel/
>
> Obviously, the link to mitsuhiko/flask-sqlalchemy is correct and
> USERNAME/REPOSITORY is wrong :)
>
> Is that something that could be changed in PyPI or would that go into a new
> version of setuptools/distribute?  Right now it seems like the only
> solution is
> to either delete or edit old entries to fix the links.
>
>
> Regards,
> Armin
>
>
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>



-- 
http://www.voidspace.org.uk

From pje at telecommunity.com  Mon Jul 26 01:03:51 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Sun, 25 Jul 2010 19:03:51 -0400
Subject: [Catalog-sig] Proposal to Reverse Ordering of Scraped Links in
 PyPI
In-Reply-To: <loom.20100726T003854-258@post.gmane.org>
References: <loom.20100726T003854-258@post.gmane.org>
Message-ID: <20100725230352.A3D6C3A4093@sparrow.telecommunity.com>

At 10:41 PM 7/25/2010 +0000, Armin Ronacher wrote:
>As a fix I would recommend just listing the latest links on the simple page or
>to reverse the order so that easy_install picks up the right one.

Detection order isn't part of the algorithm (at least for 
easy_install), so that won't actually help here.

The real issue here is that hidden links are shown for hidden 
releases on /simple pages at all.  Martin says that "users requested 
it", but IMO this is an instance where what users request isn't 
necessarily best.

If a user wants to access hidden versions of a package, they can 
always do so by direct targeting, the -f option, or the 
dependency_links setup() keyword.


From pje at telecommunity.com  Mon Jul 26 01:07:30 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Sun, 25 Jul 2010 19:07:30 -0400
Subject: [Catalog-sig] Proposal to Reverse Ordering of Scraped Links in
 PyPI
In-Reply-To: <AANLkTikAwxG0KU_CvHh32HXS9+tN2Tit3Jb-WqOsRA+Z@mail.gmail.c
 om>
References: <loom.20100726T003854-258@post.gmane.org>
	<AANLkTikAwxG0KU_CvHh32HXS9+tN2Tit3Jb-WqOsRA+Z@mail.gmail.com>
Message-ID: <20100725230736.867863A4093@sparrow.telecommunity.com>

At 11:53 PM 7/25/2010 +0100, Michael Foord wrote:
>Heh pip, distribute and setuptools *really* shouldn't be scraping 
>pypi - at least except as a last resort. pypi has an xml-rpc (and a 
>fledgling json api) which should be used in preference to scraping.

This is done to allow users to implement their own "PyPI clone" using 
nothing more than a static webserver or webpage.  XML-RPC would 
require a dynamic server.


>This *already* causes problems for pypi maintenance.

For some time now, easy_install has used the '/simple' index 
(specifically intended for automated tools' consumption), rather than 
the human-oriented pages.  Among other benefits, the /simple index 
can be served or mirrored statically, rather than being generated 
anew on each hit.


From armin.ronacher at active-4.com  Mon Jul 26 01:15:54 2010
From: armin.ronacher at active-4.com (Armin Ronacher)
Date: Mon, 26 Jul 2010 01:15:54 +0200
Subject: [Catalog-sig] Proposal to Reverse Ordering of Scraped Links in
 PyPI
In-Reply-To: <AANLkTikAwxG0KU_CvHh32HXS9+tN2Tit3Jb-WqOsRA+Z@mail.gmail.com>
References: <loom.20100726T003854-258@post.gmane.org>
	<AANLkTikAwxG0KU_CvHh32HXS9+tN2Tit3Jb-WqOsRA+Z@mail.gmail.com>
Message-ID: <4C4CC5AA.4050807@active-4.com>

Hi,

On 7/26/10 12:53 AM, Michael Foord wrote:
> Heh pip, distribute and setuptools *really* shouldn't be scraping pypi -
> at least except as a last resort. pypi has an xml-rpc (and a fledgling
> json api) which should be used in preference to scraping. This *already*
> causes problems for pypi maintenance.
I can see the value of it and until someone else is found to replace it, 
we should not tamper with that too much.


Regards,
Armin

From martin at v.loewis.de  Tue Jul 27 00:07:06 2010
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 27 Jul 2010 00:07:06 +0200
Subject: [Catalog-sig] Recent PyPI changes
Message-ID: <4C4E070A.1080205@v.loewis.de>

We (Richard Jones and I) made a number of changes to PyPI:
- there is now a way to request release information in JSON,
  see http://tinyurl.com/38lefsp
- it's possible to run the code base locally using sqlite,
  see the README.
- there is now demodata available (see README); people won't
  need a full database dump anymore to develop on the code.

In addition, pypi.appspot.com is likely to become mirror E
(perhaps B instead, so that E can have an A record).

Regards,
Martin

From chris at simplistix.co.uk  Tue Jul 27 10:52:56 2010
From: chris at simplistix.co.uk (Chris Withers)
Date: Tue, 27 Jul 2010 09:52:56 +0100
Subject: [Catalog-sig] Recent PyPI changes
In-Reply-To: <4C4E070A.1080205@v.loewis.de>
References: <4C4E070A.1080205@v.loewis.de>
Message-ID: <4C4E9E68.2040705@simplistix.co.uk>

Martin v. Löwis wrote:
> We (Richard Jones and me) made a number of changes to PyPI:
> - there is now a way to request release information in JSON,
>   see http://tinyurl.com/38lefsp
> - it's possible to run the code base locally using sqlite,
>   see the README.
> - there is now demodata available (see README); people won't
>   need a full database dump anymore to develop on the code.

All very cool :-)

> In addition, pypi.appspot.com is likely to become mirror E
> (perhaps B instead, so that E can have an A record).

I don't understand this...

What do E, B and A mean here?

Chris

-- 
Simplistix - Content Management, Batch Processing & Python Consulting
             - http://www.simplistix.co.uk

From pnasrat at google.com  Tue Jul 27 10:55:32 2010
From: pnasrat at google.com (Paul Nasrat)
Date: Tue, 27 Jul 2010 09:55:32 +0100
Subject: [Catalog-sig] Recent PyPI changes
In-Reply-To: <4C4E9E68.2040705@simplistix.co.uk>
References: <4C4E070A.1080205@v.loewis.de> <4C4E9E68.2040705@simplistix.co.uk>
Message-ID: <AANLkTik7G5OLh_0nQDZhQBsxBe2_jseTkyJVgBcEeqdV@mail.gmail.com>

On Tue, Jul 27, 2010 at 9:52 AM, Chris Withers <chris at simplistix.co.uk> wrote:
> Martin v. Löwis wrote:
>>
>> We (Richard Jones and me) made a number of changes to PyPI:
>> - there is now a way to request release information in JSON,
>>   see http://tinyurl.com/38lefsp
>> - it's possible to run the code base locally using sqlite,
>>   see the README.
>> - there is now demodata available (see README); people won't
>>   need a full database dump anymore to develop on the code.
>
> All very cool :-)
>
>> In addition, pypi.appspot.com is likely to become mirror E
>> (perhaps B instead, so that E can have an A record).
>
> I don't understand this...
>
> What do E, B and A mean here?

The new PyPI mirror scheme names each mirror

X.pypi.python.org, as documented in PEP 381.

The values of X follow the sequence a, b, c, ..., aa, ab, ...;
a.pypi.python.org is the master server.
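
To make the sequence concrete, a small sketch that maps a zero-based
position to its label (the helper name `mirror_label` is made up for this
example). It treats the labels as bijective base 26, which is one way to
read "a, b, ..., z, aa, ab, ...".

```python
import string


def mirror_label(index):
    """Map 0 -> 'a', 25 -> 'z', 26 -> 'aa', 27 -> 'ab', and so on."""
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = []
    n = index + 1  # bijective base 26 has no zero digit, so count from 1
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters.append(string.ascii_lowercase[rem])
    return "".join(reversed(letters))


def mirror_host(index, domain="pypi.python.org"):
    """Build the full host name for the mirror at the given position."""
    return mirror_label(index) + "." + domain
```

Having a direct index-to-name mapping is what makes random selection or a
binary search over mirror positions straightforward.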

Paul

From chris at simplistix.co.uk  Tue Jul 27 11:24:10 2010
From: chris at simplistix.co.uk (Chris Withers)
Date: Tue, 27 Jul 2010 10:24:10 +0100
Subject: [Catalog-sig] Recent PyPI changes
In-Reply-To: <AANLkTik7G5OLh_0nQDZhQBsxBe2_jseTkyJVgBcEeqdV@mail.gmail.com>
References: <4C4E070A.1080205@v.loewis.de>	<4C4E9E68.2040705@simplistix.co.uk>
	<AANLkTik7G5OLh_0nQDZhQBsxBe2_jseTkyJVgBcEeqdV@mail.gmail.com>
Message-ID: <4C4EA5BA.20702@simplistix.co.uk>

Paul Nasrat wrote:
> The new PyPI mirror scheme has a naming scheme
> 
> X.pypi.python.org as documented in PEP 381

PEP 381 looks awesome, how much of it is left to implement?

Chris


From ametaireau at gmail.com  Tue Jul 27 11:25:22 2010
From: ametaireau at gmail.com (Alexis Metaireau)
Date: Tue, 27 Jul 2010 11:25:22 +0200
Subject: [Catalog-sig] Recent PyPI changes
In-Reply-To: <4C4E9E68.2040705@simplistix.co.uk>
References: <4C4E070A.1080205@v.loewis.de> <4C4E9E68.2040705@simplistix.co.uk>
Message-ID: <1280222722.9978.6.camel@ecureuil>

On Tue, 2010-07-27 at 09:52 +0100, Chris Withers wrote:
> there is now a way to request release information in JSON,
> >   see http://tinyurl.com/38lefsp 
That's indeed cool and useful, but we can't rely on it while
crawling; too bad this JSON is not replicated on the mirrors.

It could help a lot, since there is currently no way to request the
metadata statically other than downloading the distribution
archives and extracting them. (We could also use XML-RPC, but that's not
static.)

What process do I have to follow to get this mirrored? Does
that sound good to you? IOW, what's needed to make this a
requirement for mirrors?

Cheers, 
Alexis


From mal at egenix.com  Tue Jul 27 11:55:49 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 27 Jul 2010 11:55:49 +0200
Subject: [Catalog-sig] Recent PyPI changes
In-Reply-To: <1280222722.9978.6.camel@ecureuil>
References: <4C4E070A.1080205@v.loewis.de> <4C4E9E68.2040705@simplistix.co.uk>
	<1280222722.9978.6.camel@ecureuil>
Message-ID: <4C4EAD25.5060001@egenix.com>

Alexis Metaireau wrote:
> On Tue, 2010-07-27 at 09:52 +0100, Chris Withers wrote:
>> there is now a way to request release information in JSON,
>>>   see http://tinyurl.com/38lefsp 
> That's indeed cool, and useful, but we can't rely on this while
> crawling, too bad this JSON is not replicated on the mirrors.
> 
> It could help a lot, since there is currently no way to request the
> metadatas statically in others way that downloading the distribution
> archives and extracting them. (we also could use xmlrpc, but that's not
> static).
> 
> What's the process I have to follow in order to get this mirrored ? Does
> that sounds good for you ? IOW, whats needed to have this as a
> requirements for mirrors? 

Easiest would be to dump the complete release information
(PKG-INFO) to a text file using the name format <version>_pkg_info
in the simple/ index.
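
A sketch of how a client might read such a file, assuming it holds plain
PKG-INFO text (RFC 822 style headers), which the standard `email.parser`
module can parse. `read_pkg_info` and `pkg_info_filename` are hypothetical
helpers, and the exact file layout on a mirror is an assumption.

```python
from email.parser import Parser


def pkg_info_filename(version):
    """Name of the metadata file for one release, per the format above."""
    return "%s_pkg_info" % version


def read_pkg_info(text):
    """Parse PKG-INFO text into a small dict of commonly used fields."""
    msg = Parser().parsestr(text)
    return {
        "name": msg["Name"],
        "version": msg["Version"],
        # Multi-valued fields appear as repeated headers.
        "requires": msg.get_all("Requires", []),
    }
```

Since the file is static text, a mirror could serve it exactly like the
rest of the simple/ index, with no dynamic server involved.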

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jul 27 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From mal at egenix.com  Tue Jul 27 13:31:41 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 27 Jul 2010 13:31:41 +0200
Subject: [Catalog-sig] Mirror list detection/construction - PEP 381
In-Reply-To: <4C483F34.5070503@v.loewis.de>
References: <AANLkTikiOJ12EJaqYw-_zid0qad7t-_c8CfXPjJz6Qur@mail.gmail.com>
	<4C483F34.5070503@v.loewis.de>
Message-ID: <4C4EC39D.3000708@egenix.com>

"Martin v. Löwis" wrote:
>> Thoughts?
> 
> I've been thinking that *.pypi.python.org should always
> yield A records, not CNAMEs.

+1

> It may be that this becomes difficult with Google appengine,
> though.

There's no rule without exception :-)

-- 
Marc-Andre Lemburg
eGenix.com

From merwok at netwok.org  Tue Jul 27 13:34:14 2010
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Tue, 27 Jul 2010 13:34:14 +0200
Subject: [Catalog-sig] Recent PyPI changes
In-Reply-To: <4C4E070A.1080205@v.loewis.de>
References: <4C4E070A.1080205@v.loewis.de>
Message-ID: <4C4EC436.3060602@netwok.org>

> We (Richard Jones and me) made a number of changes to PyPI:

Great news, kudos!

Regards


From monitor at jacobian.org  Tue Jul 27 13:55:58 2010
From: monitor at jacobian.org (monitor at jacobian.org)
Date: Tue, 27 Jul 2010 06:55:58 -0500
Subject: [Catalog-sig] [monit] pypi.python.org - Connection succeeded
Message-ID: <1280231761.1@jacobian.org>

Connection succeeded Service pypi.python.org 

	Date:        Tue, 27 Jul 2010 06:55:58 -0500
	Action:      alert
	Host:        jacobian.org
	Description: connection succeeded to INET[pypi.python.org:80] via TCP

Your faithful employee,
monit


From pje at telecommunity.com  Tue Jul 27 18:12:37 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Tue, 27 Jul 2010 12:12:37 -0400
Subject: [Catalog-sig] Proposal to Reverse Ordering of Scraped Links in
 PyPI
In-Reply-To: <AANLkTikTE41hahFPYxteO0MF0R9DPo6kFVGNfeSUn6Mg@mail.gmail.com>
References: <loom.20100726T003854-258@post.gmane.org>
	<AANLkTikAwxG0KU_CvHh32HXS9+tN2Tit3Jb-WqOsRA+Z@mail.gmail.com>
	<20100725230736.867863A4093@sparrow.telecommunity.com>
	<AANLkTikTE41hahFPYxteO0MF0R9DPo6kFVGNfeSUn6Mg@mail.gmail.com>
Message-ID: <20100727161236.EB5593A4093@sparrow.telecommunity.com>

At 12:21 PM 7/27/2010 +0200, Konrad Delong wrote:
> >> This *already* causes problems for pypi maintenance.
> >
> > For some time now, easy_install uses the '/simple' index (specifically
> > intended for automated tools' consumption), rather than the human-oriented
> > pages.  Among other benefits, the /simple index can be served or mirrored
> > statically, rather than being generated anew on each hit.
>
>To my understanding, simple index version encoding cannot be reliable
>(see [1], about this time: 2010-07-21T10:07:02 ).
>
>However, json interface sounds like a good trade-off here.
>
>Konrad
>
>
>[1] http://weblion.psu.edu/chatlogs/%23distutils/2010/07/21.txt

I don't understand what you mean by "version encoding cannot be 
reliable", even after reading the transcript above.

I did figure out that the problem you're talking about is with 
distutils2's PyPI support, and that further, distutils2's PyPI 
support is derived from easy_install's, but substantially changed in 
some areas.

So, it's quite possible that the changes broke things that the 
original in setuptools was doing, so you might want to check whether 
the problem you're experiencing also exists in setuptools.  (And I'd 
appreciate a bug report at the setuptools tracker if it does.)


From sridharr at activestate.com  Tue Jul 27 18:31:53 2010
From: sridharr at activestate.com (Sridhar Ratnakumar)
Date: Tue, 27 Jul 2010 09:31:53 -0700
Subject: [Catalog-sig] Recent PyPI changes
In-Reply-To: <4C4EAD25.5060001@egenix.com>
References: <4C4E070A.1080205@v.loewis.de> <4C4E9E68.2040705@simplistix.co.uk>
	<1280222722.9978.6.camel@ecureuil> <4C4EAD25.5060001@egenix.com>
Message-ID: <082652AC-F4BB-4243-AC3A-27F3EE4A06CA@activestate.com>


On 2010-07-27, at 2:55 AM, M.-A. Lemburg wrote:

> Alexis Metaireau wrote:
>> On Tue, 2010-07-27 at 09:52 +0100, Chris Withers wrote:
>>> there is now a way to request release information in JSON,
>>>>  see http://tinyurl.com/38lefsp 
>> That's indeed cool, and useful, but we can't rely on this while
>> crawling; too bad this JSON is not replicated on the mirrors.
>>
>> It could help a lot, since there is currently no way to request the
>> metadata statically other than downloading the distribution
>> archives and extracting them. (we could also use xmlrpc, but that's not
>> static).
>>
>> What's the process I have to follow in order to get this mirrored? Does
>> that sound good to you? IOW, what's needed to have this as a
>> requirement for mirrors?
> 
> Easiest would be to dump the complete release information
> (PKG-INFO) to a text file using the name format <version>_pkg_info
> in the simple/ index.

What we ended up doing for our internal comprehensive Python package mirror is this:

- for pkg in changed_since_yesterday(pypi): download_source_using_easy_install(pkg) 
- extract PKG-INFO out of source
- extract 'requires.txt' (if it exists) out of source 

If you want to find the dependencies of a package, they can only be found in requires.txt (not PKG-INFO).

But then even if PKG-INFO/requires.txt is provided by /simple, keep in mind that it won't be comprehensive. Not all package authors use PyPI for serving their source distributions. (This is why we also had to use setuptools.package_index).
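
A rough sketch of the two extraction steps above, assuming the source distribution is a tar archive laid out the way setuptools builds sdists (PKG-INFO at the top level of the project directory, requires.txt inside a *.egg-info directory). The function name read_sdist_metadata is made up for this example; it is not part of setuptools or of the mirror described here.

```python
import tarfile


def read_sdist_metadata(archive_path):
    """Return the text of PKG-INFO and requires.txt found in a tar sdist.

    Assumed layout (typical for setuptools sdists, not guaranteed):
      <name>-<version>/PKG-INFO
      <name>-<version>/<name>.egg-info/requires.txt
    A value is None when the archive does not contain that file.
    """
    found = {"PKG-INFO": None, "requires.txt": None}
    with tarfile.open(archive_path) as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue
            parts = member.name.split("/")
            key = None
            if parts[-1] == "PKG-INFO" and len(parts) == 2:
                key = "PKG-INFO"
            elif (parts[-1] == "requires.txt" and len(parts) >= 2
                  and parts[-2].endswith(".egg-info")):
                key = "requires.txt"
            if key is not None and found[key] is None:
                data = archive.extractfile(member).read()
                found[key] = data.decode("utf-8", "replace")
    return found
```

Either value comes back as None when the archive does not ship that file, which is the case that forces the fallback to other sources.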

-srid

From sridharr at activestate.com  Tue Jul 27 18:44:40 2010
From: sridharr at activestate.com (Sridhar Ratnakumar)
Date: Tue, 27 Jul 2010 09:44:40 -0700
Subject: [Catalog-sig] Recent PyPI changes
In-Reply-To: <082652AC-F4BB-4243-AC3A-27F3EE4A06CA@activestate.com>
References: <4C4E070A.1080205@v.loewis.de> <4C4E9E68.2040705@simplistix.co.uk>
	<1280222722.9978.6.camel@ecureuil> <4C4EAD25.5060001@egenix.com>
	<082652AC-F4BB-4243-AC3A-27F3EE4A06CA@activestate.com>
Message-ID: <0387EF8E-049F-4B46-9485-EF221C9FF04F@activestate.com>


On 2010-07-27, at 9:31 AM, Sridhar Ratnakumar wrote:

> 
> On 2010-07-27, at 2:55 AM, M.-A. Lemburg wrote:
> 
>> Alexis Metaireau wrote:
>>> On Tue, 2010-07-27 at 09:52 +0100, Chris Withers wrote:
>>>> there is now a way to request release information in JSON,
>>>>> see http://tinyurl.com/38lefsp 
>>> That's indeed cool, and useful, but we can't rely on this while
>>> crawling; too bad this JSON is not replicated on the mirrors.
>>>
>>> It could help a lot, since there is currently no way to request the
>>> metadata statically other than downloading the distribution
>>> archives and extracting them. (we could also use xmlrpc, but that's not
>>> static).
>>>
>>> What's the process I have to follow in order to get this mirrored? Does
>>> that sound good to you? IOW, what's needed to have this as a
>>> requirement for mirrors?
>> 
>> Easiest would be to dump the complete release information
>> (PKG-INFO) to a text file using the name format <version>_pkg_info
>> in the simple/ index.
> 
> What we ended up doing for our internal comprehensive Python package mirror is this:
> 
> - for pkg in changed_since_yesterday(pypi): download_source_using_easy_install(pkg) 
> - extract PKG-INFO out of source
> - extract 'requires.txt' (if it exists) out of source 
> 
> If you want to find the dependencies of a package, they can only be found in requires.txt (not PKG-INFO).
> 
> But then even if PKG-INFO/requires.txt is provided by /simple, keep in mind that it won't be comprehensive. Not all package authors use PyPI for serving their source distributions. (This is why we also had to use setuptools.package_index).

And, of course, not all source distributions include a PKG-INFO file, in which case it becomes mandatory to run "python setup.py egg_info" (after patching setup.py to do 'import setuptools') to generate it.  Twisted-10.1.0.tar.bz2 is one example of this.

There are still several other minor issues with metadata mirroring that I don't recall at the moment.

-srid

From ametaireau at gmail.com  Tue Jul 27 18:49:21 2010
From: ametaireau at gmail.com (Alexis Metaireau)
Date: Tue, 27 Jul 2010 18:49:21 +0200
Subject: [Catalog-sig] Proposal to Reverse Ordering of Scraped Links in
 PyPI
In-Reply-To: <20100727161236.EB5593A4093@sparrow.telecommunity.com>
References: <loom.20100726T003854-258@post.gmane.org>
	<AANLkTikAwxG0KU_CvHh32HXS9+tN2Tit3Jb-WqOsRA+Z@mail.gmail.com>
	<20100725230736.867863A4093@sparrow.telecommunity.com>
	<AANLkTikTE41hahFPYxteO0MF0R9DPo6kFVGNfeSUn6Mg@mail.gmail.com>
	<20100727161236.EB5593A4093@sparrow.telecommunity.com>
Message-ID: <1280249361.18626.9.camel@ecureuil>

On Tue, 2010-07-27 at 12:12 -0400, P.J. Eby wrote:
> At 12:21 PM 7/27/2010 +0200, Konrad Delong wrote:
> > >> This *already* causes problems for pypi maintenance.
> > >
> > > For some time now, easy_install uses the '/simple' index (specifically
> > > intended for automated tools' consumption), rather than the human-oriented
> > > pages.  Among other benefits, the /simple index can be served or mirrored
> > > statically, rather than being generated anew on each hit.
> >
> >To my understanding, simple index version encoding cannot be reliable
> >(see [1], about this time: 2010-07-21T10:07:02 ).
> >
> >However, json interface sounds like a good trade-off here.
> >
> >Konrad
> >
> >
> >[1] http://weblion.psu.edu/chatlogs/%23distutils/2010/07/21.txt

Konrad, what I was saying last time was about the way Michael Foord has
added -py* at the end of his archive names, so that each archive is only
retrieved for Python versions == py*, IIRC.

For this, I think the simple index is not a good way to get this kind of
information: these Python-version-specific distributions are not easy to
encode in an archive name, and there is no defined scheme to parse such
names, IIRC. It could work for a single Python version, but not for
something like (python >= 2.4 and <= 2.7).

> So, it's quite possible that the changes broke things that the 
> original in setuptools was doing
I'll try to break as few things as possible :)

Cheers, 
Alex


From pje at telecommunity.com  Tue Jul 27 19:51:15 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Tue, 27 Jul 2010 13:51:15 -0400
Subject: [Catalog-sig] Proposal to Reverse Ordering of Scraped Links in
 PyPI
In-Reply-To: <1280249361.18626.9.camel@ecureuil>
References: <loom.20100726T003854-258@post.gmane.org>
	<AANLkTikAwxG0KU_CvHh32HXS9+tN2Tit3Jb-WqOsRA+Z@mail.gmail.com>
	<20100725230736.867863A4093@sparrow.telecommunity.com>
	<AANLkTikTE41hahFPYxteO0MF0R9DPo6kFVGNfeSUn6Mg@mail.gmail.com>
	<20100727161236.EB5593A4093@sparrow.telecommunity.com>
	<1280249361.18626.9.camel@ecureuil>
Message-ID: <20100727175114.C70D13A4093@sparrow.telecommunity.com>

At 06:49 PM 7/27/2010 +0200, Alexis Metaireau wrote:
>On Tue, 2010-07-27 at 12:12 -0400, P.J. Eby wrote:
> > At 12:21 PM 7/27/2010 +0200, Konrad Delong wrote:
> > > >> This *already* causes problems for pypi maintenance.
> > > >
> > > > For some time now, easy_install uses the '/simple' index (specifically
> > > > intended for automated tools' consumption), rather than the human-oriented
> > > > pages.  Among other benefits, the /simple index can be served or mirrored
> > > > statically, rather than being generated anew on each hit.
> > >
> > >To my understanding, simple index version encoding cannot be reliable
> > >(see [1], about this time: 2010-07-21T10:07:02 ).
> > >
> > >However, json interface sounds like a good trade-off here.
> > >
> > >Konrad
> > >
> > >
> > >[1] http://weblion.psu.edu/chatlogs/%23distutils/2010/07/21.txt
>
>Konrad, what I was saying last time was about the way Michael Foord has
>added -py* at the end of his archive names, so that each archive is only
>retrieved for Python versions == py*, IIRC.

Ah.  easy_install doesn't support that either; it just doesn't result
in an error, because it assumes that unittest2-0.5.0-py2.3.tar.gz
(http://pypi.python.org/packages/source/u/unittest2/unittest2-0.5.0-py2.3.tar.gz#md5=c4722438ea5f3f327082c29e9d5ad25a)
is either

1) the '0.5.0-py2.3' version of unittest2, or
2) the 'py2.3' version of 'unittest2-0.5.0'

And then selects one based on whether you're looking for a project 
named 'unittest2' or 'unittest2-0.5.0'.

Distutils2, OTOH, is failing because it can't process '0.5.0-py2.3' 
as a version number without more heuristics in its "rational 
suggestion" algorithm.
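
To make the ambiguity concrete, here is a small sketch that lists every way such a file name can be split into (project, version) at a '-'. It is an illustration only, not the actual setuptools code; candidate_splits and pick_version are hypothetical names, and the extension list is an assumption.

```python
# Archive extensions recognised by this sketch (an assumption, not a
# complete list of what any real tool accepts).
ARCHIVE_EXTENSIONS = (".tar.gz", ".tar.bz2", ".tgz", ".zip")


def candidate_splits(filename):
    """Return every (project, version) reading of an archive file name."""
    for ext in ARCHIVE_EXTENSIONS:
        if filename.endswith(ext):
            stem = filename[:-len(ext)]
            break
    else:
        return []
    parts = stem.split("-")
    # Split after the 1st, 2nd, ... dash-separated component.
    return [("-".join(parts[:i]), "-".join(parts[i:]))
            for i in range(1, len(parts))]


def pick_version(filename, project):
    """Pick the reading whose project part matches the requested name."""
    for name, version in candidate_splits(filename):
        if name == project:
            return version
    return None


for name, version in candidate_splits("unittest2-0.5.0-py2.3.tar.gz"):
    print("project=%r version=%r" % (name, version))
```

Asking for 'unittest2' selects the '0.5.0-py2.3' reading, and that string is what a strict version parser then has to cope with.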


From martin at v.loewis.de  Tue Jul 27 21:35:06 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 27 Jul 2010 21:35:06 +0200
Subject: [Catalog-sig] Recent PyPI changes
In-Reply-To: <4C4E9E68.2040705@simplistix.co.uk>
References: <4C4E070A.1080205@v.loewis.de> <4C4E9E68.2040705@simplistix.co.uk>
Message-ID: <4C4F34EA.50606@v.loewis.de>

Am 27.07.2010 10:52, schrieb Chris Withers:
> Martin v. Löwis wrote:
>> We (Richard Jones and me) made a number of changes to PyPI:
>> - there is now a way to request release information in JSON,
>>   see http://tinyurl.com/38lefsp
>> - it's possible to run the code base locally using sqlite,
>>   see the README.
>> - there is now demodata available (see README); people won't
>>   need a full database dump anymore to develop on the code.
> 
> All very cool :-)
> 
>> In addition, pypi.appspot.com is likely to become mirror E
>> (perhaps B instead, so that E can have an A record).
> 
> I don't understand this...
> 
> What do E, B and A mean here?

Paul already explained most of it. An A record is a database
entry in the Domain Name System, mapping a name to an IP address.

Regards,
Martin

From martin at v.loewis.de  Tue Jul 27 21:35:37 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 27 Jul 2010 21:35:37 +0200
Subject: [Catalog-sig] Recent PyPI changes
In-Reply-To: <4C4EA5BA.20702@simplistix.co.uk>
References: <4C4E070A.1080205@v.loewis.de>	<4C4E9E68.2040705@simplistix.co.uk>
	<AANLkTik7G5OLh_0nQDZhQBsxBe2_jseTkyJVgBcEeqdV@mail.gmail.com>
	<4C4EA5BA.20702@simplistix.co.uk>
Message-ID: <4C4F3509.20904@v.loewis.de>

Am 27.07.2010 11:24, schrieb Chris Withers:
> Paul Nasrat wrote:
>> The new pypi mirror scheme has a naming scheme
>>
>> X.pypi.python.org as documented in PEP 381
> 
> PEP 381 looks awesome, how much of it is left to implement?

Nothing that I know of. Of course, clients now need to learn to use it.
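
As a sketch of what a client has to do, assuming the PEP 381 naming scheme (a.pypi.python.org, b.pypi.python.org, ..., with multi-letter labels after z) and assuming the client already knows the label of the last mirror (PEP 381 publishes it through the last.pypi.python.org DNS name), the list of mirror host names could be built like this. The function names are made up for the example.

```python
import string
from itertools import count, product


def mirror_labels():
    """Yield 'a', 'b', ..., 'z', 'aa', 'ab', ... in order."""
    for width in count(1):
        for letters in product(string.ascii_lowercase, repeat=width):
            yield "".join(letters)


def mirror_hosts(last_label, domain="pypi.python.org"):
    """Return all host names from 'a' up to and including last_label."""
    if not last_label or any(c not in string.ascii_lowercase
                             for c in last_label):
        raise ValueError("mirror label must be lowercase ASCII letters")
    hosts = []
    for label in mirror_labels():
        hosts.append("%s.%s" % (label, domain))
        if label == last_label:
            return hosts


print(mirror_hosts("e"))
```

The DNS lookup that yields the last label is left out on purpose, since it needs network access.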

Regards,
Martin

From martin at v.loewis.de  Tue Jul 27 21:38:54 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 27 Jul 2010 21:38:54 +0200
Subject: [Catalog-sig] Recent PyPI changes
In-Reply-To: <1280222722.9978.6.camel@ecureuil>
References: <4C4E070A.1080205@v.loewis.de> <4C4E9E68.2040705@simplistix.co.uk>
	<1280222722.9978.6.camel@ecureuil>
Message-ID: <4C4F35CE.7030403@v.loewis.de>

Am 27.07.2010 11:25, schrieb Alexis Metaireau:
> On Tue, 2010-07-27 at 09:52 +0100, Chris Withers wrote:
>> there is now a way to request release information in JSON,
>>>   see http://tinyurl.com/38lefsp 
> That's indeed cool, and useful, but we can't rely on this while
> crawling; too bad this JSON is not replicated on the mirrors.
>
> It could help a lot, since there is currently no way to request the
> metadata statically other than downloading the distribution
> archives and extracting them. (we could also use xmlrpc, but that's not
> static).

That's not true - you can also use the DOAP records.

> 
> What's the process I have to follow in order to get this mirrored? Does
> that sound good to you? IOW, what's needed to have this as a
> requirement for mirrors?

Discuss on catalog-sig, then propose a change to PEP 381.

Notice it is out of scope for the objective of the PEP, though, which
is to support continued operation of setuptools-style tools in case of
a PyPI outage. I'm not sure what kind of application you have in mind
where replication of JSON files would help in case of some failure.

Regards,
Martin

From martin at v.loewis.de  Tue Jul 27 22:25:58 2010
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 27 Jul 2010 22:25:58 +0200
Subject: [Catalog-sig] PyPI reverse download
Message-ID: <4C4F40D6.9090704@v.loewis.de>

I'll be implementing a feature for PyPI where you can POST
to a certain action (revdownload), and then PyPI will POST
the requested file to a URL that was passed; this is needed
to make blobs work on AppEngine.

Any objections?

Regards,
Martin

From jacob at jacobian.org  Tue Jul 27 22:34:31 2010
From: jacob at jacobian.org (Jacob Kaplan-Moss)
Date: Tue, 27 Jul 2010 13:34:31 -0700
Subject: [Catalog-sig] PyPI reverse download
In-Reply-To: <4C4F40D6.9090704@v.loewis.de>
References: <4C4F40D6.9090704@v.loewis.de>
Message-ID: <AANLkTin91GUCLKr5SefV1VKsNTOYzMnB0EJqT_Y0czZT@mail.gmail.com>

On Tue, Jul 27, 2010 at 1:25 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> I'll be implementing a feature for PyPI where you can POST
> to a certain action (revdownload), and then PyPI will POST
> the requested file to a URL that was passed; this is needed
> to make blobs work on AppEngine.
>
> Any objections?

Seems like this is ripe for abuse -- it's essentially an open relay
for POST requests, so I could use it to amplify a DDoS attack. So
it sounds like there needs to be some sort of security, or a
whitelist of allowed URLs (or prefixes?), or somesuch.

Jacob

From mal at egenix.com  Tue Jul 27 22:46:02 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 27 Jul 2010 22:46:02 +0200
Subject: [Catalog-sig] PyPI reverse download
In-Reply-To: <4C4F40D6.9090704@v.loewis.de>
References: <4C4F40D6.9090704@v.loewis.de>
Message-ID: <4C4F458A.5070300@egenix.com>

"Martin v. Löwis" wrote:
> I'll be implementing a feature for PyPI where you can POST
> to a certain action (revdownload), and then PyPI will POST
> the requested file to a URL that was passed; this is needed
> to make blobs work on AppEngine.
> 
> Any objections?

Could you provide more detail on how this would work and why
this is needed for AppEngine ?

Having the PyPI server (or any other server running the software)
open connections to arbitrary hosts under control of a 3rd party
does not sound like a good idea, but I guess I'm just missing
some detail :-)

-- 
Marc-Andre Lemburg
eGenix.com


From martin at v.loewis.de  Tue Jul 27 22:52:25 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 27 Jul 2010 22:52:25 +0200
Subject: [Catalog-sig] PyPI reverse download
In-Reply-To: <AANLkTin91GUCLKr5SefV1VKsNTOYzMnB0EJqT_Y0czZT@mail.gmail.com>
References: <4C4F40D6.9090704@v.loewis.de>
	<AANLkTin91GUCLKr5SefV1VKsNTOYzMnB0EJqT_Y0czZT@mail.gmail.com>
Message-ID: <4C4F4709.8090602@v.loewis.de>

>> Any objections?
> 
> Seems like this is ripe for abuse -- it's essentially an open relay
> for POST requests, so I could use it to amplify a DDoS attack. So
> it sounds like there needs to be some sort of security, or a
> whitelist of allowed URLs (or prefixes?), or somesuch.

I guess I'll restrict it to posting to *.python.org, then.
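
A minimal sketch of such a restriction, assuming the target is given as a full http(s) URL and that only hosts under python.org should be accepted. The function name is hypothetical and this is not the PyPI implementation.

```python
from urllib.parse import urlsplit

ALLOWED_DOMAIN = "python.org"


def is_allowed_target(url):
    """Accept only http(s) URLs whose host is a subdomain of python.org."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    # Require a leading dot so that "evilpython.org" does not match.
    return host.endswith("." + ALLOWED_DOMAIN)


print(is_allowed_target("http://b.pypi.python.org/upload"))
print(is_allowed_target("http://python.org.example.com/upload"))
```

Comparing the parsed host name, rather than searching the raw URL string, avoids accepting look-alike hosts or URLs that merely mention python.org in the path.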

Thanks,
Martin

From noah at coderanger.net  Tue Jul 27 22:55:36 2010
From: noah at coderanger.net (Noah Kantrowitz)
Date: Tue, 27 Jul 2010 13:55:36 -0700
Subject: [Catalog-sig] PyPI reverse download
In-Reply-To: <4C4F40D6.9090704@v.loewis.de>
References: <4C4F40D6.9090704@v.loewis.de>
Message-ID: <00b101cb2dce$120ff660$362fe320$@net>

There are other ways to do this on GAE (was actually looking into this a few
days ago). I can send some reference links when I get home from work.

--Noah

> -----Original Message-----
> From: catalog-sig-bounces+noah=coderanger.net at python.org
> [mailto:catalog-sig-bounces+noah=coderanger.net at python.org] On Behalf
> Of "Martin v. Löwis"
> Sent: Tuesday, July 27, 2010 1:26 PM
> To: catalog-sig
> Subject: [Catalog-sig] PyPI reverse download
>
> I'll be implementing a feature for PyPI where you can POST
> to a certain action (revdownload), and then PyPI will POST
> the requested file to a URL that was passed; this is needed
> to make blobs work on AppEngine.
> 
> Any objections?
> 
> Regards,
> Martin


From martin at v.loewis.de  Tue Jul 27 23:06:30 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 27 Jul 2010 23:06:30 +0200
Subject: [Catalog-sig] PyPI reverse download
In-Reply-To: <4C4F458A.5070300@egenix.com>
References: <4C4F40D6.9090704@v.loewis.de> <4C4F458A.5070300@egenix.com>
Message-ID: <4C4F4A56.1040203@v.loewis.de>

Am 27.07.2010 22:46, schrieb M.-A. Lemburg:
> "Martin v. Löwis" wrote:
>> I'll be implementing a feature for PyPI where you can POST
>> to a certain action (revdownload), and then PyPI will POST
>> the requested file to a URL that was passed; this is needed
>> to make blobs work on AppEngine.
>>
>> Any objections?
> 
> Could you provide more detail on how this would work and why
> this is needed for AppEngine ?

Ok, here is the long story.

First, I tried to use the approach of pypione, using blobs for
distributions. That won't work because blobs are limited to 1MB.

Then I tried using lists of blobs instead. That won't work because
the HTTP response size is limited to 10MB.

Then I tried using Range: headers to mirror large files in pieces.
That won't work because I then wouldn't be able to serve the files
to setuptools, unless that would also start to use Range: headers.

The only way to serve files larger than 10MB is the blobstore.

However, apps can neither read from nor write to the blobstore.
The only way to read from it is to serve the file, and the only
way to write to it is through a POST from the outside.

BTW, Google has kindly granted the app access to the blobstore
(which is a for-fees feature only), and also kindly increased
the store quota (which is 1GB in the free service, when a PyPI
mirror needs about 15GB).
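
For reference, the Range: approach mentioned above amounts to fetching a file in fixed-size pieces. A sketch of how the header values could be computed, assuming a 1MB piece size and a known total file size (the function name is made up):

```python
def byte_ranges(total_size, chunk_size=1024 * 1024):
    """Yield HTTP Range header values that together cover total_size bytes."""
    for start in range(0, total_size, chunk_size):
        # Range end offsets are inclusive, hence the "- 1".
        end = min(start + chunk_size, total_size) - 1
        yield "bytes=%d-%d" % (start, end)


for value in byte_ranges(2500, chunk_size=1000):
    print("Range: " + value)
```

This only covers fetching in pieces; as noted above, serving the reassembled file is the part that fails.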

Regards,
Martin

From noah at coderanger.net  Tue Jul 27 23:38:15 2010
From: noah at coderanger.net (Noah Kantrowitz)
Date: Tue, 27 Jul 2010 14:38:15 -0700
Subject: [Catalog-sig] PyPI reverse download
In-Reply-To: <4C4F4A56.1040203@v.loewis.de>
References: <4C4F40D6.9090704@v.loewis.de> <4C4F458A.5070300@egenix.com>
	<4C4F4A56.1040203@v.loewis.de>
Message-ID: <00b401cb2dd4$05de1bf0$119a53d0$@net>



> -----Original Message-----
> From: catalog-sig-bounces+noah=coderanger.net at python.org
> [mailto:catalog-sig-bounces+noah=coderanger.net at python.org] On Behalf
> Of "Martin v. Löwis"
> Sent: Tuesday, July 27, 2010 2:07 PM
> To: M.-A. Lemburg
> Cc: catalog-sig
> Subject: Re: [Catalog-sig] PyPI reverse download
>
> Am 27.07.2010 22:46, schrieb M.-A. Lemburg:
> > "Martin v. Löwis" wrote:
> >> I'll be implementing a feature for PyPI where you can POST
> >> to a certain action (revdownload), and then PyPI will POST
> >> the requested file to a URL that was passed; this is needed
> >> to make blobs work on AppEngine.
> >>
> >> Any objections?
> >
> > Could you provide more detail on how this would work and why
> > this is needed for AppEngine ?
> 
> Ok, here is the long story.
> 
> First, I tried to use the approach of pypione, using blobs for
> distributions. That won't work because blobs are limited to 1MB.
> 
> Then I tried using lists of blobs instead. That won't work because
> the HTTP response size is limited to 10MB.
>
> Then I tried using Range: headers to mirror large files in pieces.
> That won't work because I then wouldn't be able to serve the files
> to setuptools, unless that would also start to use Range: headers.
>
> The only way to serve files larger than 10MB is the blobstore.
> 
> However, apps can neither read from nor write to the blobstore.
> The only way to read from it is to serve the file, and the only
> way to write to it is through a POST from the outside.
> 
> BTW, Google has kindly granted the app access to the blobstore
> (which is a for-fees feature only), and also kindly increased
> the store quota (which is 1GB in the free service, when a PyPI
> mirror needs about 15GB).

I've been able to write to it using urlfetch internally. You just craft a
request with the POST from the app and send it to the URL you get from the
blobstore API.
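
A sketch of the request-crafting step, using only the standard library. It assumes the blobstore upload URL expects an ordinary multipart/form-data POST with one file field, which is how browser form uploads to it work; the field name "file" and the function name are placeholders. Sending the body is left out, since that depends on the App Engine urlfetch API and its size limits.

```python
import uuid


def encode_multipart_file(field_name, filename, data,
                          content_type="application/octet-stream"):
    """Build (headers, body) for a single-file multipart/form-data POST."""
    boundary = uuid.uuid4().hex
    disposition = ('Content-Disposition: form-data; name="%s"; filename="%s"'
                   % (field_name, filename))
    lines = [
        b"--" + boundary.encode("ascii"),
        disposition.encode("utf-8"),
        ("Content-Type: %s" % content_type).encode("ascii"),
        b"",                 # blank line separates part headers from content
        data,
        b"--" + boundary.encode("ascii") + b"--",
        b"",                 # makes the body end with CRLF
    ]
    body = b"\r\n".join(lines)
    headers = {"Content-Type": "multipart/form-data; boundary=%s" % boundary}
    return headers, body


headers, body = encode_multipart_file("file", "Foo-1.0.tar.gz", b"payload")
print(headers["Content-Type"])
print(len(body))
```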

--Noah


From martin at v.loewis.de  Tue Jul 27 23:58:34 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 27 Jul 2010 23:58:34 +0200
Subject: [Catalog-sig] PyPI reverse download
In-Reply-To: <00b401cb2dd4$05de1bf0$119a53d0$@net>
References: <4C4F40D6.9090704@v.loewis.de>
	<4C4F458A.5070300@egenix.com>	<4C4F4A56.1040203@v.loewis.de>
	<00b401cb2dd4$05de1bf0$119a53d0$@net>
Message-ID: <4C4F568A.8010008@v.loewis.de>

>> However, apps can neither read from nor write to the blobstore.
>> The only way to read from it is to serve the file, and the only
>> way to write to it is through a POST from the outside.
>>
> I've been able to write to it using urlfetch internally. You just craft a
> request with the POST from the app and send it to the URL you get from the
> blobstore API.

Won't that fail if the request is larger than 1MB?

http://code.google.com/intl/de/appengine/docs/python/urlfetch/overview.html#Quotas_and_Limits

Regards,
Martin

From mal at egenix.com  Wed Jul 28 00:06:11 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 28 Jul 2010 00:06:11 +0200
Subject: [Catalog-sig] PyPI reverse download
In-Reply-To: <4C4F4A56.1040203@v.loewis.de>
References: <4C4F40D6.9090704@v.loewis.de> <4C4F458A.5070300@egenix.com>
	<4C4F4A56.1040203@v.loewis.de>
Message-ID: <4C4F5853.7000700@egenix.com>

"Martin v. Löwis" wrote:
> Am 27.07.2010 22:46, schrieb M.-A. Lemburg:
>> "Martin v. Löwis" wrote:
>>> I'll be implementing a feature for PyPI where you can POST
>>> to a certain action (revdownload), and then PyPI will POST
>>> the requested file to a URL that was passed; this is needed
>>> to make blobs work on AppEngine.
>>>
>>> Any objections?
>>
>> Could you provide more detail on how this would work and why
>> this is needed for AppEngine ?
> 
> Ok, here is the long story.
> 
> First, I tried to use the approach of pypione, using blobs for
> distributions. That won't work because blobs are limited to 1MB.
> 
> Then I tried using lists of blobs instead. That won't work because
> the HTTP response size is limited to 10MB.
>
> Then I tried using Range: headers to mirror large files in pieces.
> That won't work because I then wouldn't be able to serve the files
> to setuptools, unless that would also start to use Range: headers.
>
> The only way to serve files larger than 10MB is the blobstore.
> 
> However, apps can neither read from nor write to the blobstore.
> The only way to read from it is to serve the file, and the only
> way to write to it is through a POST from the outside.
> 
> BTW, Google has kindly granted the app access to the blobstore
> (which is a for-fees feature only), and also kindly increased
> the store quota (which is 1GB in the free service, when a PyPI
> mirror needs about 15GB).

Thanks for the details.

One aspect I still don't understand is why you'd want to upload
the whole PyPI mirror image in one go. Wouldn't it be better
to just upload the distribution files separately ? (I don't
think any of those is more than a few tens of MB in size)

Another aspect I don't (yet) understand is why these uploads
would have to be initiated from outside the main PyPI server.

I suspect that you want to use this feature to sync an
AppEngine mirror with the data on the main server. For that,
you'd only need to be able to upload data from that one
server to the AppEngine blobstore. This should be possible
without any external request to a PyPI RPC interface, simply
via a script run via a cronjob or perhaps triggered by
a new distribution file upload.

The Amazon Cloudfront mirror would essentially work in the same
way.

-- 
Marc-Andre Lemburg
eGenix.com


From martin at v.loewis.de  Wed Jul 28 00:23:46 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 28 Jul 2010 00:23:46 +0200
Subject: [Catalog-sig] PyPI reverse download
In-Reply-To: <4C4F5853.7000700@egenix.com>
References: <4C4F40D6.9090704@v.loewis.de> <4C4F458A.5070300@egenix.com>
	<4C4F4A56.1040203@v.loewis.de> <4C4F5853.7000700@egenix.com>
Message-ID: <4C4F5C72.4060902@v.loewis.de>

> One aspect I still don't understand is why you'd want to upload
> the whole PyPI mirror image in one go.

Of course I don't want that. That's 14GB, and it would be silly
to upload it all at once.

> Wouldn't it be better
> to just upload the distribution files separately ? (I don't
> think any of those is more than a few tens of MB in size)

The largest file is 20MB, and the limit in the GAE url fetcher
for the response size is 1MB. So I can't fetch some of the files.

> Another aspect I don't (yet) understand is why these uploads
> would have to be initiated from outside the main PyPI server.

I want to initiate them from the mirror.

Regards,
Martin

From ametaireau at gmail.com  Wed Jul 28 09:30:38 2010
From: ametaireau at gmail.com (Alexis Metaireau)
Date: Wed, 28 Jul 2010 09:30:38 +0200
Subject: [Catalog-sig] Proposal to Reverse Ordering of Scraped Links in
 PyPI
In-Reply-To: <20100727175114.C70D13A4093@sparrow.telecommunity.com>
References: <loom.20100726T003854-258@post.gmane.org>
	<AANLkTikAwxG0KU_CvHh32HXS9+tN2Tit3Jb-WqOsRA+Z@mail.gmail.com>
	<20100725230736.867863A4093@sparrow.telecommunity.com>
	<AANLkTikTE41hahFPYxteO0MF0R9DPo6kFVGNfeSUn6Mg@mail.gmail.com>
	<20100727161236.EB5593A4093@sparrow.telecommunity.com>
	<1280249361.18626.9.camel@ecureuil>
	<20100727175114.C70D13A4093@sparrow.telecommunity.com>
Message-ID: <1280302238.1807.2.camel@ecureuil>

On Tue, 2010-07-27 at 13:51 -0400, P.J. Eby wrote:
> Ah.  easy_install doesn't support that either; it just doesn't result
> in an error, because it assumes that unittest2-0.5.0-py2.3.tar.gz
> (http://pypi.python.org/packages/source/u/unittest2/unittest2-0.5.0-py2.3.tar.gz#md5=c4722438ea5f3f327082c29e9d5ad25a)
> is either
> 
> 1) the '0.5.0-py2.3' version of unittest2, or
> 2) the 'py2.3' version of 'unittest2-0.5.0'
> 
> And then selects one based on whether you're looking for a project 
> named 'unittest2' or 'unittest2-0.5.0'.
> 
> Distutils2, OTOH, is failing because it can't process '0.5.0-py2.3' 
> as a version number without more heuristics in its "rational 
> suggestion" algorithm.
Exactly !



From monitor at jacobian.org  Thu Jul 29 14:36:29 2010
From: monitor at jacobian.org (monitor at jacobian.org)
Date: Thu, 29 Jul 2010 07:36:29 -0500
Subject: [Catalog-sig] [monit] pypi.python.org - Connection failed
Message-ID: <1280406992.1@jacobian.org>

Connection failed Service pypi.python.org 

	Date:        Thu, 29 Jul 2010 07:36:29 -0500
	Action:      alert
	Host:        jacobian.org
	Description: failed protocol test [HTTP] at INET[pypi.python.org:80] via TCP

Your faithful employee,
monit


From ametaireau at gmail.com  Thu Jul 29 15:05:03 2010
From: ametaireau at gmail.com (Alexis Metaireau)
Date: Thu, 29 Jul 2010 15:05:03 +0200
Subject: [Catalog-sig] [monit] pypi.python.org - Connection failed
In-Reply-To: <1280406992.1@jacobian.org>
References: <1280406992.1@jacobian.org>
Message-ID: <1280408703.9848.4.camel@ecureuil>

On Thu, 2010-07-29 at 07:36 -0500, monitor at jacobian.org wrote:
> Connection failed Service pypi.python.org 

I took this opportunity to check, in a real situation, whether the
mirroring support in the distutils2 "simple" crawler works, and it just
works! (e.g. the client automatically switches to another mirror if one
seems to be down).
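
The switching behaviour described here can be sketched roughly as follows. This is not the distutils2 code; the fetch callable is passed in by the caller (so any HTTP library can be used) and is assumed to raise IOError when a host is unreachable.

```python
def fetch_with_failover(path, hosts, fetch):
    """Try each mirror host in order and return the first successful result.

    fetch is any callable taking a URL and returning its content; it is
    assumed to raise IOError when the host cannot be reached.
    """
    last_error = None
    for host in hosts:
        url = "http://%s%s" % (host, path)
        try:
            return fetch(url)
        except IOError as exc:
            last_error = exc   # remember the failure, try the next mirror
    raise IOError("all mirrors failed for %s (last error: %s)"
                  % (path, last_error))
```

A caller would pass the list of mirror host names in preference order, with the main index first.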

Yay !

Cheers, 
Alexis


From monitor at jacobian.org  Thu Jul 29 15:34:45 2010
From: monitor at jacobian.org (monitor at jacobian.org)
Date: Thu, 29 Jul 2010 08:34:45 -0500
Subject: [Catalog-sig] [monit] pypi.python.org - Connection succeeded
Message-ID: <1280410487.1@jacobian.org>

Connection succeeded Service pypi.python.org 

	Date:        Thu, 29 Jul 2010 08:34:45 -0500
	Action:      alert
	Host:        jacobian.org
	Description: connection succeeded to INET[pypi.python.org:80] via TCP

Your faithful employee,
monit


From pje at telecommunity.com  Thu Jul 29 19:51:27 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Thu, 29 Jul 2010 13:51:27 -0400
Subject: [Catalog-sig] Suggested change to /simple index
Message-ID: <20100729175116.A74703A4114@sparrow.telecommunity.com>

Recently, a proposal was made to change the sorting of links on 
PyPI's /simple index to prevent problems with easy_install finding 
out-of-date non-PyPI download links.  That proposal, unfortunately, 
would not have solved the actual problem.

After giving it some thought, I have an alternative proposal, that I 
think *would* solve the problem, and work for all scraping tools 
using the /simple index, not just easy_install.

Essentially, the problem is that when links to "hidden" versions were 
added to the /simple index (to satisfy users wanting to be able to 
download older versions' distributions), in-description and 
home/download page links were included.  However, if a package's home 
page URL or revision control download links change over time, the 
older ones still show up in the /simple listing, leading to ambiguity 
for download tools.

However, since the actual use case for which this was added was only 
to support reaching specific older versions of a project, it isn't 
actually necessary to include links that aren't to downloadable files 
with a specific version number.

Say package Foo releases version 1.1, causing 1.0 to become 
hidden.  People still want to be able to download the 1.0's .tgz's or 
.rpm's or what-have-you's.  However, they do *not* still need to be 
able to access the project's older, now-defunct home page, or any of 
the extra links included in the older version's description.

It is these extraneous links that cause the problem, not the access 
to PyPI-hosted archives.

Now, it could be argued that some projects use their "download" or 
"home page" link (or even in-description links) to point to actual 
archives, and that in such cases older download links would be lost by 
omitting such links for "hidden" versions.  However, if that's really 
a problem, it could be remedied by simply checking whether the URL 
contains a file extension, or a revision number, or something like that.

However, since the original request to access hidden versions was 
aimed squarely at PyPI-hosted downloads, the original use case could 
still be met simply by only including PyPI-hosted links for "hidden" 
releases, thereby ensuring that other links are only shown for 
"current" versions -- i.e., the only versions whose 
home/download/description links package authors would expect to have 
to keep up to date.

Making such a change would immediately fix many problematic/ambiguous 
links in the /simple index, where out-of-date or no-longer-available 
links are shown.  (It would also fix the security issue whereby 
someone acquiring a no-longer-in-service URL could point it at trojan 
downloads.)
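
The rule proposed above can be sketched as follows. This is only an
illustration under stated assumptions, not PyPI's implementation: the
function names, the PYPI_HOST constant and the extension list are all
hypothetical. Links for a current release pass through untouched; for a
hidden release only PyPI-hosted links are kept, with an optional
fallback that also keeps external links that look like archive files.

```python
from urllib.parse import urlparse

# Assumptions for illustration: the host name and the extension list.
PYPI_HOST = "pypi.python.org"
ARCHIVE_EXTENSIONS = (".tar.gz", ".tgz", ".zip", ".egg", ".rpm", ".exe")


def is_pypi_hosted(url):
    """True if the link points at the (assumed) PyPI host."""
    return urlparse(url).hostname == PYPI_HOST


def looks_like_archive(url):
    """True if the URL path ends with a known archive extension."""
    return urlparse(url).path.lower().endswith(ARCHIVE_EXTENSIONS)


def links_for_release(links, hidden, keep_external_archives=False):
    """Return the links to list on the /simple page for one release."""
    if not hidden:
        # Current releases keep home page, download and description links.
        return list(links)
    return [
        url for url in links
        if is_pypi_hosted(url)
        or (keep_external_archives and looks_like_archive(url))
    ]
```

For a hidden Foo 1.0 with a PyPI-hosted Foo-1.0.tar.gz, an old home
page and an externally hosted Foo-1.0.zip, only the PyPI-hosted archive
is listed by default; the external .zip is kept only when
keep_external_archives is set.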


From jim at zope.com  Thu Jul 29 21:00:05 2010
From: jim at zope.com (Jim Fulton)
Date: Thu, 29 Jul 2010 15:00:05 -0400
Subject: [Catalog-sig] Suggested change to /simple index
In-Reply-To: <20100729175116.A74703A4114@sparrow.telecommunity.com>
References: <20100729175116.A74703A4114@sparrow.telecommunity.com>
Message-ID: <AANLkTikV43ndhyLt3=Py_uB8T+UqZmnGwyTBKNmjgm49@mail.gmail.com>

On Thu, Jul 29, 2010 at 1:51 PM, P.J. Eby <pje at telecommunity.com> wrote:
> Recently, a proposal was made to change the sorting of links on PyPI's
> /simple index to prevent problems with easy_install finding out-of-date
> non-PyPI download links.  That proposal, unfortunately, would not have
> solved the actual problem.
>
> After giving it some thought, I have an alternative proposal, that I think
> *would* solve the problem, and work for all scraping tools using the /simple
> index, not just easy_install.
>
> Essentially, the problem is that when links to "hidden" versions were added
> to the /simple index (to satisfy users wanting to be able to download older
> versions' distributions), in-description and home/download page links were
> included.  However, if a package's home page URL or revision control
> download links change over time, the older ones still show up in the /simple
> listing, leading to ambiguity for download tools.
>
> However, since the actual use case for which this was added was only to
> support reaching specific older versions of a project, it isn't actually
> necessary to include links that aren't to downloadable files with a specific
> version number.
>
> Say package Foo releases version 1.1, causing 1.0 to become hidden.  People
> still want to be able to download the 1.0's .tgz's or .rpm's or
> what-have-you's.  However, they do *not* still need to be able to access the
> project's older, now-defunct home page, or any of the extra links included
> in the older version's description.
>
> It is these extraneous links that cause the problem, not the access to
> PyPI-hosted archives.
>
> Now, it could be argued that if a project used its "download" or "home page"
> link (or even in-description links) to point to actual archives, and if that
> is the case, then older links would be lost by omitting such links for
> "hidden" versions.  However, if that's really a problem, it could be
> remedied by simply checking whether the URL contains a file extension, or a
> revision number, or something like that.
>
> However, since the original request to access hidden versions was aimed
> squarely at PyPI-hosted downloads, the original use case could still be met
> simply by only including PyPI-hosted links for "hidden" releases, thereby
> insuring that other links are only shown for "current" versions -- i.e.,
> ones that package authors would expect are the only versions whose
> home/download/description links would need to be kept up-to-date on.
>
> Making such a change would immediately fix many problematic/ambiguous links
> in the /simple index, where out-of-date or no-longer available links are
> shown.  (It would also fix the security issue whereby someone acquiring a
> no-longer-in-service URL could link it to trojan downloads.)

+1

Jim

-- 
Jim Fulton

From vernondcole at gmail.com  Thu Jul 29 21:58:18 2010
From: vernondcole at gmail.com (Vernon Cole)
Date: Thu, 29 Jul 2010 13:58:18 -0600
Subject: [Catalog-sig] Ownership change request.
Message-ID: <AANLkTinTOEzJwQTAGovEXu3xVCApb3b5XrxFLj7RBrDW@mail.gmail.com>

Newbie alert -- this may be a dumb question.

I hope that this is the correct place to make this request.  Please forward
or refer me elsewhere if not.

I took over maintenance of the adodbapi package on sourceforge several years
ago after Henrik Ekelund and Jim Abrams disappeared from view.  I am now
creating a distutils setup.py for the package, and would like to update the
entries on PyPI. I cannot do that because Henrik (username ekelund) is the
owner.

Can I somehow get read/write access to (or ownership of) the listing for
this product?  The information is getting very obsolete.
--
Vernon Cole
python.org user name "kf7xm"

From mal at egenix.com  Thu Jul 29 23:10:07 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 29 Jul 2010 23:10:07 +0200
Subject: [Catalog-sig] Ownership change request.
In-Reply-To: <AANLkTinTOEzJwQTAGovEXu3xVCApb3b5XrxFLj7RBrDW@mail.gmail.com>
References: <AANLkTinTOEzJwQTAGovEXu3xVCApb3b5XrxFLj7RBrDW@mail.gmail.com>
Message-ID: <4C51EE2F.4060108@egenix.com>

Vernon Cole wrote:
> Newbie alert -- this may be a dumb question.
> 
> I hope that this is the correct place to make this request.  Please forward
> or refer me elsewhere if not.
> 
> I took over maintenance of the adodbapi package on sourceforge several years
> ago after Henrik Ekelund and Jim Abrams disappeared from view.  I am now
> creating a distutils setup.py for the package, and would like to update the
> entries on PyPI. I cannot do that because Henrik (username ekelund) is the
> owner.
> 
> Can I somehow get read/write access to (or ownership of) the listing for
> this product?  The information is getting very obsolete.

I think you have to file a PyPI support request for this using the
PyPI tracker.

-- 
Marc-Andre Lemburg
eGenix.com


From martin at v.loewis.de  Thu Jul 29 23:24:17 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 29 Jul 2010 23:24:17 +0200
Subject: [Catalog-sig] Ownership change request.
In-Reply-To: <4C51EE2F.4060108@egenix.com>
References: <AANLkTinTOEzJwQTAGovEXu3xVCApb3b5XrxFLj7RBrDW@mail.gmail.com>
	<4C51EE2F.4060108@egenix.com>
Message-ID: <4C51F181.3000108@v.loewis.de>

On 29.07.2010 23:10, M.-A. Lemburg wrote:
> Vernon Cole wrote:
>> Newbie alert -- this may be a dumb question.
>>
>> I hope that this is the correct place to make this request.  Please forward
>> or refer me elsewhere if not.
>>
>> I took over maintenance of the adodbapi package on sourceforge several years
>> ago after Henrik Ekelund and Jim Abrams disappeared from view.  I am now
>> creating a distutils setup.py for the package, and would like to update the
>> entries on PyPI. I cannot do that because Henrik (username ekelund) is the
>> owner.
>>
>> Can I somehow get read/write access to (or ownership of) the listing for
>> this product?  The information is getting very obsolete.
> 
> I think you have to file a PyPI support request for this using the
> PyPI tracker.

That would be appreciated. If I can find the time to do the necessary
research, and don't forget about it, I might be able to help even
without a tracker request.

Regards,
Martin

From monitor at jacobian.org  Fri Jul 30 06:26:09 2010
From: monitor at jacobian.org (monitor at jacobian.org)
Date: Thu, 29 Jul 2010 23:26:09 -0500
Subject: [Catalog-sig] [monit] pypi.python.org - Connection failed
Message-ID: <1280463972.1@jacobian.org>

Connection failed Service pypi.python.org 

	Date:        Thu, 29 Jul 2010 23:26:09 -0500
	Action:      alert
	Host:        jacobian.org
	Description: failed protocol test [HTTP] at INET[pypi.python.org:80] via TCP

Your faithful employee,
monit


From monitor at jacobian.org  Fri Jul 30 06:59:52 2010
From: monitor at jacobian.org (monitor at jacobian.org)
Date: Thu, 29 Jul 2010 23:59:52 -0500
Subject: [Catalog-sig] [monit] pypi.python.org - Connection succeeded
Message-ID: <1280465994.1@jacobian.org>

Connection succeeded Service pypi.python.org 

	Date:        Thu, 29 Jul 2010 23:59:52 -0500
	Action:      alert
	Host:        jacobian.org
	Description: connection succeeded to INET[pypi.python.org:80] via TCP

Your faithful employee,
monit