Utility to mirror an egg index from a Debian distribution

Hi,

It may seem like a backwards way of doing things, but I have a need for a utility that can maintain a python package index mirror of a Debian repository. The basic idea is to extract the tarballs of Python packages from the Debian repository and rename them to the original setuptools names. It should also create a buildout-compatible versions file of the versions in the repository.

My current implementation idea is to unpack the tarball and use the egg-metadata to figure out what the "egg" name of the tarball should be.

Does such a tool exist? If not, I'll probably start working on one on svn.zope.org Real Soon Now (TM). Comments, suggestions much appreciated!

-- Brian Sutherland

At 01:00 PM 5/15/2009 +0200, Brian Sutherland wrote:
Hi,
It may seem like a backwards way of doing things, but I have a need for a utility that can maintain a python package index mirror of a Debian repository.
The basic idea is to extract the tarballs of Python packages from the Debian repository and rename them to the original setuptools name. It should also create a buildout-compatible versions file of the versions in the repository.
My current implementation idea is to unpack the tarball and use the egg-metadata to figure out what the "egg" name of the tarball should be.
Running "setup.py --name --version" will dump out the name and version, whether you use distutils or setuptools. If you want a setuptools-compatible name and version, you'll need to postprocess those strings with pkg_resources.safe_name() and safe_version(), then escape them with to_filename() if you're using them as components of a sdist or egg filename.
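A small sketch of that normalization chain (the raw name and version strings here are made up for illustration):

```python
import pkg_resources

# Raw values as "setup.py --name --version" might print them (illustrative)
raw_name = 'my.package_name'
raw_version = '1.0dev'

# Normalize to setuptools-compatible forms...
name = pkg_resources.safe_name(raw_name)
version = pkg_resources.safe_version(raw_version)

# ...then escape for use as components of an sdist/egg filename
filename = '%s-%s.tar.gz' % (pkg_resources.to_filename(name),
                             pkg_resources.to_filename(version))
print(filename)
```

Note the order matters: safe_name() replaces runs of illegal characters with "-", and to_filename() then turns "-" into "_" so the name can't be confused with the name/version separator in the filename.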

Hi,

I'm using WorkingSet and I have run into an issue. In my package manager I can now install and deinstall packages fine. The problem that I now face is that I want to refresh the WorkingSet after a package has been deinstalled. Or, I need to do the refresh when swapping to a different version of python that is installed on the system.

The code looks something like this:

    def installed_packages(self):
        result = []
        if not self.interpreted:
            site.addsitedir(self.python_sitepackages_path)
        import pkg_resources
        ws = pkg_resources.WorkingSet()
        for i in ws:
            s = str(i)
            result.append(s.split(' '))
        return result

At 10:24 PM 5/16/2009 -0400, David Lyon wrote:
Hi,
I'm using WorkingSet and I have run into an issue.
In my package manager I can now install and deinstall packages fine.
The problem that I now face is that I want to refresh the WorkingSet after a package has been deinstalled.
Or, I need to do the refresh when swapping to a different version of python that is installed on the system.
Just create a fresh WorkingSet object, and use that one. You probably want to be using explicitly-created WorkingSet instances for this anyway. (easy_install certainly does.)
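For example (a minimal sketch; `pkg_resources` must be importable):

```python
import pkg_resources

# The module-level pkg_resources.working_set is built once at import
# time, so it never notices later installs or removals. A fresh,
# explicitly-created WorkingSet snapshots sys.path at the moment you
# create it.
ws = pkg_resources.WorkingSet()

for dist in ws:
    print(dist.project_name, dist.version)
```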

On Sat, 16 May 2009 22:36:13 -0400, "P.J. Eby" <pje@telecommunity.com> wrote:
Just create a fresh WorkingSet object, and use that one. You probably want to be using explicitly-created WorkingSet instances for this anyway. (easy_install certainly does.)
I am creating a fresh one every time...

So after I deinstall, the package is still shown, even though it's no longer in the system. It's still in the interpreter's memory.

When I look at pkg_resources I can't find "the real code".. it seems to be all just 'jumps'.. I fully understand this refreshing of packages is outside of the original code, but I am at a loss to find where in the system I can find the code that implements WorkingSet... If you have any idea where that code is... I would be very grateful...

David

At 10:45 PM 5/16/2009 -0400, David Lyon wrote:
On Sat, 16 May 2009 22:36:13 -0400, "P.J. Eby" <pje@telecommunity.com> wrote:
Just create a fresh WorkingSet object, and use that one. You probably want to be using explicitly-created WorkingSet instances for this anyway. (easy_install certainly does.)
I am creating a fresh one every time...
So after I deinstall, the package is still shown, even though it's no longer in the system. It's still in the interpreter's memory.
Do you mean on sys.path? If it's an .egg file or directory on sys.path it will probably still be listed as a distribution when you create a new WorkingSet. Either that, or it's an .egg-info egg whose .egg-info you didn't delete.
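One way to check where each listed distribution is coming from (a diagnostic sketch using `pkg_resources.find_distributions`, which is what a WorkingSet scans sys.path with):

```python
import sys
import pkg_resources

# Walk sys.path the way a fresh WorkingSet would, printing each
# distribution and its location. A "deinstalled" package that still
# appears here is usually a leftover .egg file/dir on sys.path, or a
# stale .egg-info sitting next to the package.
for entry in sys.path:
    for dist in pkg_resources.find_distributions(entry):
        print(dist.project_name, dist.version, '->', dist.location)
```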

On Sun, 17 May 2009 00:43:20 -0400, "P.J. Eby" <pje@telecommunity.com> wrote:
So after I deinstall, the package is still shown, even though it's no longer in the system. It's still in the interpreter's memory.
Do you mean on sys.path? If it's an .egg file or directory on sys.path it will probably still be listed as a distribution when you create a new WorkingSet. Either that, or it's an .egg-info egg whose .egg-info you didn't delete.
I have to restart the program for the refresh to work...

Ideally, I would like to see the package list refreshed immediately after a package is installed or removed. I can make a workaround and hide packages that are deinstalled, but I was hoping to find an easier way.

    Checking for C:\Python25\scripts\easy_install.exe
    Running installer ...
    C:\Python25\scripts\easy_install.exe demset
    Searching for demset
    Reading http://pypi.python.org/simple/demset/
    Reading http://deron.meranda.us/python/demset/
    Best match: demset 1.0
    Downloading http://deron.meranda.us/python/demset/dist/demset-1.0.tar.gz
    Processing demset-1.0.tar.gz
    Running demset-1.0\setup.py -q bdist_egg --dist-dir c:\docume~1\david\locals~1\temp\easy_install-ozryy7\demset-1.0\egg-dist-tmp-jiy7q0
    zip_safe flag not set; analyzing archive contents...
    Adding demset 1.0 to easy-install.pth file
    Installed c:\python25\lib\site-packages\demset-1.0-py2.5.egg
    Processing dependencies for demset
    Finished processing dependencies for demset
    Preparing to remove demset
    - package mentioned in PTH c:\python25\lib\site-packages\easy-install.pth
    - Updating PTH c:\python25\lib\site-packages\easy-install.pth
    - Removing Egg c:\python25\lib\site-packages\demset-1.0-py2.5.egg
    - deinst completed demset

At 12:55 AM 5/17/2009 -0400, David Lyon wrote:
On Sun, 17 May 2009 00:43:20 -0400, "P.J. Eby" <pje@telecommunity.com> wrote:
So after I deinstall, the package is still shown, even though it's no longer in the system. It's still in the interpreter's memory.
Do you mean on sys.path? If it's an .egg file or directory on sys.path it will probably still be listed as a distribution when you create a new WorkingSet. Either that, or it's an .egg-info egg whose .egg-info you didn't delete.
I have to restart the program for the refresh to work...
Ideally, I would like to see the package list refreshed immediately after a package is installed or removed. I can make a workaround and hide packages that are deinstalled, but I was hoping to find an easier way.
You didn't answer my question.

On Sun, 17 May 2009 10:29:33 -0400, "P.J. Eby" <pje@telecommunity.com> wrote:
Do you mean on sys.path? If it's an .egg file or directory on sys.path it will probably still be listed as a distribution when you create a new WorkingSet. Either that, or it's an .egg-info egg whose .egg-info you didn't delete.
Ok, you're right. I didn't update sys.path... I will go do that.

As for the .egg file/directory - they are deleted. Yes - I didn't check for .egg-info directories. Thanks - I'll try that and see if it fixes it.

David
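A sketch of that refresh step, assuming the deinstaller knows the path of the egg it just removed (the function name is made up for illustration):

```python
import sys
import pkg_resources

def refresh_working_set(removed_egg_path):
    """Rebuild the package list after removing an egg.

    Deleting the egg from disk is not enough: its entry must also be
    dropped from sys.path, or the new WorkingSet will scan it again.
    """
    sys.path[:] = [p for p in sys.path if p != removed_egg_path]
    return pkg_resources.WorkingSet()  # snapshot of the pruned sys.path
```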

On Fri, May 15, 2009 at 12:25:31PM -0400, P.J. Eby wrote:
At 01:00 PM 5/15/2009 +0200, Brian Sutherland wrote:
Hi,
It may seem like a backwards way of doing things, but I have a need for a utility that can maintain a python package index mirror of a Debian repository.
The basic idea is to extract the tarballs of Python packages from the Debian repository and rename them to the original setuptools name. It should also create a buildout-compatible versions file of the versions in the repository.
My current implementation idea is to unpack the tarball and use the egg-metadata to figure out what the "egg" name of the tarball should be.
Running "setup.py --name --version" will dump out the name and version, whether you use distutils or setuptools. If you want a setuptools-compatible name and version, you'll need to postprocess those strings with pkg_resources.safe_name() and safe_version(), then escape them with to_filename() if you're using them as components of a sdist or egg filename.
Thanks, I've just released a prototype:

    http://pypi.python.org/pypi/van.reposync

However, as I did not want to actually execute code contained in the tarball, I did something like this:

    basedir = os.path.dirname(egg_info)
    metadata = PathMetadata(basedir, egg_info)
    dist_name = os.path.splitext(os.path.basename(egg_info))[0]
    dist = Distribution(basedir, project_name=dist_name, metadata=metadata)

Unfortunately I had to play some guessing games to find where the .egg-info directory was.

-- Brian Sutherland

At 01:10 PM 6/15/2009 +0200, Brian Sutherland wrote:
Thanks, I've just released a prototype:
http://pypi.python.org/pypi/van.reposync
However, as I did not want to actually execute code contained in the tarball, I did something like this:
    basedir = os.path.dirname(egg_info)
    metadata = PathMetadata(basedir, egg_info)
    dist_name = os.path.splitext(os.path.basename(egg_info))[0]
    dist = Distribution(basedir, project_name=dist_name, metadata=metadata)
Unfortunately I had to play some guessing games to find where the .egg-info directory was.
Actually, if all you want is the name and version, I suppose you could parse the PKG-INFO, which is always in a set location in an sdist tarball or zip.
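A sketch of that approach, assuming a standard sdist layout where the metadata sits at <name>-<version>/PKG-INFO inside the tarball (the function name is illustrative):

```python
import tarfile
from email.parser import Parser

def name_version_from_sdist(sdist_path):
    """Read Name/Version from an sdist's PKG-INFO without running setup.py."""
    with tarfile.open(sdist_path) as tar:
        # Standard sdists keep PKG-INFO directly under the top-level
        # <name>-<version>/ directory.
        member = next(m for m in tar.getnames()
                      if m.count('/') == 1 and m.endswith('/PKG-INFO'))
        pkg_info = tar.extractfile(member).read().decode('utf-8')
    headers = Parser().parsestr(pkg_info)  # PKG-INFO is RFC 822-style
    return headers['Name'], headers['Version']
```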

On Mon, Jun 15, 2009 at 11:08:22AM -0400, P.J. Eby wrote:
At 01:10 PM 6/15/2009 +0200, Brian Sutherland wrote:
Thanks, I've just released a prototype:
http://pypi.python.org/pypi/van.reposync
However, as I did not want to actually execute code contained in the tarball, I did something like this:
    basedir = os.path.dirname(egg_info)
    metadata = PathMetadata(basedir, egg_info)
    dist_name = os.path.splitext(os.path.basename(egg_info))[0]
    dist = Distribution(basedir, project_name=dist_name, metadata=metadata)
Unfortunately I had to play some guessing games to find where the .egg-info directory was.
Actually, if all you want is the name and version, I suppose you could parse the PKG-INFO, which is always in a set location in an sdist tarball or zip.
Yep, I changed to that method and managed to remove the guessing games.

-- Brian Sutherland

On 09-05-15 09:25 AM, P.J. Eby wrote:
My current implementation idea is to unpack the tarball and use the egg-metadata to figure out what the "egg" name of the tarball should be.
Running "setup.py --name --version" will dump out the name and version, whether you use distutils or setuptools. If you want a setuptools-compatible name and version, you'll need to postprocess those strings with pkg_resources.safe_name() and safe_version(), then escape them with to_filename() if you're using them as components of a sdist or egg filename.
There are two issues with relying on "setup.py --<field>" to find the value of <field>:

1) setuptools prints warning messages to stdout: http://bugs.python.org/setuptools/issue73 (does `safe_name()` handle this?)

2) Some packages on PyPI print unexpected messages depending upon the environment (even if you pass --name to setup.py).

How do you suggest finding the name/version in such cases?
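(As far as I can tell, safe_name() doesn't help here; it only normalizes a string you already have.) One workaround I've seen is to capture stdout and keep only the last non-empty line, on the assumption that warnings are printed before the value. A sketch (the helper name is made up, and the heuristic can still be fooled by chatty setup scripts):

```python
import subprocess
import sys

def setup_py_field(setup_dir, field):
    """Ask setup.py for one metadata field, e.g. 'name' or 'version'.

    Warnings printed to stdout land before the value, so take the last
    non-empty line. This is a heuristic, not a guarantee: a setup.py
    that prints something after setup() runs will still break it.
    """
    out = subprocess.run(
        [sys.executable, 'setup.py', '--%s' % field],
        cwd=setup_dir, capture_output=True, text=True, check=True,
    ).stdout
    lines = [line.strip() for line in out.splitlines() if line.strip()]
    return lines[-1]
```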

Hi Brian,

It sounds interesting. I might be interested in helping.

What I would like to do is make a test script to download all the packages off pypi and build them under multiple platforms. Basically, I want to make some tests that will try to install "everything" and then deinstall "everything" and see what happens.

I'm not sure if this has ever been done.. but it might provide some interesting test results..

Maybe it's similar...

On Fri, 15 May 2009 13:00:41 +0200, Brian Sutherland <brian@vanguardistas.net> wrote:
Hi,
It may seem like a backwards way of doing things, but I have a need for a utility that can maintain a python package index mirror of a Debian repository.
The basic idea is to extract the tarballs of Python packages from the Debian repository and rename them to the original setuptools name. It should also create a buildout-compatible versions file of the versions in the repository.
My current implementation idea is to unpack the tarball and use the egg-metadata to figure out what the "egg" name of the tarball should be.
Does such a tool exist? If not, I'll probably start working on one on svn.zope.org Real Soon Now (TM). Comments, suggestions much appreciated!

On Sat, May 16, 2009 at 07:13:43AM -0400, David Lyon wrote:
Hi Brian,
Hi David,
It sounds interesting. I might be interested in helping.
Great :)
What I would like to do is make a test script to download all the packages off pypi and build them under multiple platforms.
Basically, I want to make some tests that will try to install "everything" and then deinstall "everything" and see what happens.
I'm not sure if this has ever been done.. but it might provide some interesting test results..
I think that could give some good information, especially about dependency issues, and perhaps file conflicts.
Maybe it's similar...
Unfortunately I don't think it's similar enough to fold these two projects into one. I'm basically trying to re-make pypi (or a list of links to pypi) from a Debian repository, i.e. something that looks like this: http://ftp.us.debian.org/debian/

Also, while I will probably be downloading most of the tarballs, I definitely won't be installing anything.
On Fri, 15 May 2009 13:00:41 +0200, Brian Sutherland <brian@vanguardistas.net> wrote:
Hi,
It may seem like a backwards way of doing things, but I have a need for a utility that can maintain a python package index mirror of a Debian repository.
The basic idea is to extract the tarballs of Python packages from the Debian repository and rename them to the original setuptools name. It should also create a buildout-compatible versions file of the versions in the repository.
My current implementation idea is to unpack the tarball and use the egg-metadata to figure out what the "egg" name of the tarball should be.
Does such a tool exist? If not, I'll probably start working on one on svn.zope.org Real Soon Now (TM). Comments, suggestions much appreciated!
-- Brian Sutherland
participants (4)
- Brian Sutherland
- David Lyon
- P.J. Eby
- Sridhar Ratnakumar