[Catalog-sig] Flamenco queries

Phillip J. Eby pje at telecommunity.com
Mon Apr 9 19:54:16 CEST 2007


At 12:00 PM 4/9/2007 +0200, Martin v. Löwis wrote:
> > Perhaps this?
> >
> > select rc.trove_id, count(*)
> >   from releases r, release_classifiers rc
> >  where r.name=rc.name and r.version=rc.version
> >    and r._pypi_hidden=FALSE
> > group by rc.trove_id
>
>This did work indeed (although it was slower than when doing
>the counting in Python...)

It would only be efficient if there were indexes the optimizer could use; 
without at least an index beginning with trove_id, a temporary table would 
be required to build the result.


>I found that it was semantically incorrect, as packages that
>have multiple level-three trove ids should be counted only
>once.

Ah...  that would require something like "count(distinct name+version)" 
instead of count(*), so the Python approach is probably best.



More information about the Catalog-sig mailing list