Hi Holger,

I think you might run into a 32K entries per directory restriction
when uploading so many packages to a private index.

I've installed your most recent changes for this from Bitbucket.
Let's see how it goes!

Are you doing this only because you want to avoid forgetting 
about your downloads? 

I'm doing this because I've already had my fair share of trouble with PyPI, and I'd like to rule out any dependence on whether PyPI is available or not. I simply cannot depend on external factors when dealing with systems in production. As you mentioned before, I could employ bandersnatch, but I really don't see a reason to adopt more than one tool for a single purpose. Besides, I like the idea of having multiple indexes while developing applications. So, again, devpi fits the bill very well. I just need to upload those 38K+ packages into devpi once and probably never think about this again.

Cheers :)

Richard Gomes
http://rgomes.info
http://www.linkedin.com/in/rgomes
mobile: +44(77)9955-6813
inum: +883(5100)0800-9804
sip:r...@ippi.fr

On 26/01/14 14:06, holger krekel wrote:
hi Richard,

On Sun, Jan 26, 2014 at 13:55 +0000, Richard Gomes wrote:
Hello Holger,

Thanks a lot for your prompt answer.
Yes, I understand that we'd better keep /root/pypi as it is.

I've created an inherited index based on /root/pypi, as shown below:

(py276)rgomes@pypi:/srv/pypi$ devpi index /root/pypimirror
http://pypi.localdomain:8080/root/pypimirror:
  type=stage
  bases=root/pypi
  volatile=True
  uploadtrigger_jenkins=None
  acl_upload=root

I'm currently uploading packages onto this index.
I will let you know later how it goes, including any difficulties
and the performance I observe.
I think you might run into a 32K entries per directory restriction
when uploading so many packages to a private index.

Am I correct to think that, if pip retrieves a package that was
updated on PyPI, it (pip) will get the updated package because
/root/pypimirror inherits from /root/pypi, which keeps itself in sync
with PyPI?
I think this should work, yes.
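
To make the inheritance question concrete, here is a toy model in Python. It is not devpi's code: it only assumes that a lookup consults the index's own files first and then each base in order, and that the first index providing a given filename wins. The function name, the dictionary layout and the "foo" archives are all made up for illustration.

```python
def resolve_files(index, indexes, project):
    """Map each visible filename to the index that provides it.

    Toy model only: walks the index and then its bases, breadth first,
    and keeps the first index seen for every filename.
    """
    visible = {}
    pending = [index]
    visited = set()
    while pending:
        name = pending.pop(0)
        if name in visited:
            continue
        visited.add(name)
        config = indexes[name]
        for filename in config["files"].get(project, []):
            # setdefault keeps the entry from the index closest to the caller
            visible.setdefault(filename, name)
        pending.extend(config["bases"])
    return visible


# Hypothetical state: foo-1.0 was uploaded by hand, foo-1.1 appeared on PyPI later.
EXAMPLE_INDEXES = {
    "root/pypi": {
        "bases": [],
        "files": {"foo": ["foo-1.0.tar.gz", "foo-1.1.tar.gz"]},
    },
    "root/pypimirror": {
        "bases": ["root/pypi"],
        "files": {"foo": ["foo-1.0.tar.gz"]},
    },
}
```

Under these assumptions, resolving "foo" through root/pypimirror yields the hand-uploaded foo-1.0 from the private index and the newer foo-1.1 through the root/pypi base.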

Are you doing this only because you want to avoid forgetting 
about your downloads? 

best,
holger


Thanks

Richard Gomes

On 26/01/14 07:56, holger krekel wrote:
Hello Richard,

On Sun, Jan 26, 2014 at 02:39 +0000, Richard Gomes wrote:
Hello Holger,

I understood the point about triggering all index pages.
But I guess that it would download everything [again] from PyPI, correct?
When a fresh devpi-server starts up, nothing is done except getting
the list of all projects from pypi.python.org.  Only the triggering
of index pages will cache them, and only accessing a specific archive
file will cache that.

So, to avoid another wasteful full download (I've already done that
once), I'm trying to upload packages from a local folder.
If you want to upload pypi.python.org packages you have to do that
into a private index and that will not be automatically kept up to
date via the syncing mechanism.  You would have to do any updates
by hand.  There is no way around that, volatile index or not.
/root/pypi is an index where you cannot change properties like volatile.
The error message could be better, agreed.

The only thing we could consider is adding an option to devpi-server
that allows looking up archive files from a local directory (structure)
before trying a remote operation.  We know the md5 checksum and filename
and so can match precisely with already downloaded files.
But don't hold your breath on that.
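
The local-lookup idea above could be sketched roughly as follows. This is not an existing devpi-server option. It only illustrates matching an already downloaded file by filename and MD5 checksum before falling back to a remote fetch. The helper names md5_of and find_local_archive are hypothetical, and the directory layout is assumed to be arbitrary (it is searched recursively).

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_local_archive(local_dir, filename, expected_md5):
    """Return the path of a local file matching name and MD5, else None.

    Hypothetical helper: walks local_dir recursively and verifies the
    checksum of every file whose name equals filename.
    """
    expected = expected_md5.lower()
    for root, _dirs, files in os.walk(local_dir):
        if filename in files:
            candidate = os.path.join(root, filename)
            if md5_of(candidate) == expected:
                return candidate
    return None
```

A caller would try find_local_archive() first and only download the archive when it returns None.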

cheers,

holger

Unless you tell me it is not going to work, I'd like to stay on this
route. You know... I can take this chance to gain some mileage with devpi :)

First thing would be making /root/pypi volatile (if I understood properly!).


# let's play with /root/test first, just to see how it works
(py276)rgomes@pypi:/srv/pypi$ devpi index /root/test volatile=False
/root/test changing volatile: False
http://pypi.localdomain:8080/root/test:
  type=stage
  bases=root/pypi
  volatile=False
  uploadtrigger_jenkins=None
  acl_upload=root
[87112 refs]

(py276)rgomes@pypi:/srv/pypi$ devpi index /root/test volatile=True
/root/test changing volatile: True
http://pypi.localdomain:8080/root/test:
  type=stage
  bases=root/pypi
  volatile=True
  uploadtrigger_jenkins=None
  acl_upload=root
[87112 refs]

# now let's try with /root/pypi
(py276)rgomes@pypi:/srv/pypi$ devpi index /root/pypi volatile=True
/root/pypi changing volatile: True
http://pypi.localdomain:8080/root/pypi:
  type=mirror
  bases=
  volatile=False
Traceback (most recent call last):
  File "/home/rgomes/.virtualenvs/py276/bin/devpi", line 9, in <module>
    load_entry_point('devpi-common==1.2', 'console_scripts', 'devpi')()
  File "/home/rgomes/.virtualenvs/py276/lib/python2.7/site-packages/devpi/main.py", line 29, in main
    return method(hub, hub.args)
  File "/home/rgomes/.virtualenvs/py276/lib/python2.7/site-packages/devpi/index.py", line 65, in main
    return index_modify(hub, url, kvdict)
  File "/home/rgomes/.virtualenvs/py276/lib/python2.7/site-packages/devpi/index.py", line 14, in index_modify
    index_show(hub, url)
  File "/home/rgomes/.virtualenvs/py276/lib/python2.7/site-packages/devpi/index.py", line 36, in index_show
    ixconfig["uploadtrigger_jenkins"],))
KeyError: 'uploadtrigger_jenkins'
[87112 refs]

Hum... not very good :(

I've opened issue #82
https://bitbucket.org/hpk42/devpi/issue/82/cannot-make-root-pypi-volatile

Meanwhile, I will play with another index and let you know how it goes.

Thanks

Richard Gomes

On 25/01/14 09:29, holger krekel wrote:
Hi Richard,

On Fri, Jan 24, 2014 at 21:51 +0000, Richard Gomes wrote:
Hello Holger,

No, I haven't tried bandersnatch.
I think devpi is perfect for my workflow and I'm not willing to try
other things.
ok.

Sorry for not being very clear on my intentions.
My fault. I made the question more complicated than it should be.

In a nutshell, I just wanted a full PyPI mirror, but I'm not sure how I
should load all those 38K packages into devpi cache.
I guess that I should make /root/pypi volatile and I should upload
package by package onto it.
Does it make sense?
There are 38K projects but many more archive files.  If you want a full
mirror, that's around 40 GB of storage (and the corresponding network traffic).

What I think makes more sense is to trigger all index pages by
iterating over all projects, and maybe fetch (and thereby cache) the
first (highest-version) archive file of each.  You can get some ideas
on how to do that from server/extra/compare_devpi_server.py in the
repository, e.g. the getnames() function.
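
A minimal sketch of that triggering loop, written for Python 3. It is not the code from compare_devpi_server.py. It assumes the project names are supplied by the caller and that a project's index page is served under `<index-url>/+simple/<project>/`. The function names and the injectable fetch parameter are made up for illustration.

```python
from urllib.parse import quote
from urllib.request import urlopen


def simple_page_url(index_url, project):
    """Build the (assumed) simple-index page URL for one project."""
    return "%s/+simple/%s/" % (index_url.rstrip("/"), quote(project))


def trigger_index_pages(index_url, projects, fetch=None):
    """Request every project's index page; return the names that failed.

    fetch is a callable taking a URL. It defaults to a plain HTTP GET and
    can be replaced, e.g. for a dry run.
    """
    if fetch is None:
        def fetch(url):
            with urlopen(url) as response:
                response.read()
    failed = []
    for name in projects:
        try:
            fetch(simple_page_url(index_url, name))
        except OSError:  # urllib's URLError and HTTPError derive from OSError
            failed.append(name)
    return failed
```

For example, trigger_index_pages("http://pypi.localdomain:8080/root/pypi", names) would touch one page per entry in names and report the ones that could not be fetched, so they can be retried later.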

I'm basically confused about how devpi decides (or detects) that a
package must be updated from PyPI once I've uploaded it by hand. I'm
not sure whether uploading by hand would work well or would confuse
devpi about when new updates should be downloaded from PyPI.
Once you have touched/retrieved all project index pages, devpi-server
will auto-update the index pages for those projects.

HTH,
holger



On 24/01/14 20:05, holger krekel wrote:
On Fri, Jan 24, 2014 at 04:02 -0800, Richard Gomes wrote:
I've downloaded all packages from PyPI and I'd like to create a local 
mirror using devpi.
If you want to have a full non-lazy mirror, did you consider using
bandersnatch?

I had the following idea:

     1. create a new index say /root/pypimirror based on /root/pypi

     2. upload the entire folder containing 38,000+ packages onto 
/root/pypimirror
This might fail if your system uses a 32K limit on directory entries.
I've just fixed it for /root/pypi (not released yet) and i can also
fix it for private indices.

     3. finally, set /root/pypimirror to non-volatile (if this can be 
done, somehow)
You can change index volatility any time.

    /myuser/myindex
        +-- /root/pypimirror
                +-- /root/pypi

Another idea would be creating a "parallel" index to /root/pypi and 
exposing a third index which derives from both.

    /myuser/myindex
        +-- /root/pypi
        +-- /root/pypimirror


Does this idea make sense? Is there a better way of doing it, in particular 
without having to download everything again from PyPI?
Could you state more clearly what you want to achieve in the first
place?  I can see a number of possible motivations but would like
to understand your particular ones.  (Did I mention that at school I
was always very bad at understanding word problems in math, although
others seemed able to guess the correct meaning of the question? :)

I have another concern: performance. Since 38,000+ packages implies 
large directories in the file system, do you think devpi will have 
trouble managing that many packages?
see above.  /root/pypi handles >32K fine on trunk.  And private indices
can also be made to do so. (So far there wasn't a use case for >32K 
private packages -- let's see if we have one here)
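
To see whether a local download folder is likely to hit such a limit, one could count directory entries first. The sketch below is only a rough check: the default threshold of 32000 is an assumption standing in for the "32K" mentioned above, the real limit depends on the file system, and the function name is made up.

```python
import os


def oversized_dirs(root, limit=32000):
    """Yield (path, entry_count) for directories under root above limit.

    limit is an assumed threshold, not a value taken from devpi.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        count = len(dirnames) + len(filenames)
        if count > limit:
            yield dirpath, count
```

Running list(oversized_dirs("/path/to/downloads")) on the folder holding the archives shows which directories, if any, exceed the threshold.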

best,
holger



Thanks a lot,

-- Richard
