[Catalog-sig] Package Quality Measurement for packages on Pypi

Terry Reedy tjreedy at udel.edu
Wed Nov 18 20:32:24 CET 2009


David Lyon wrote:
>> (cut from python-dev)
>> On Fri, 13 Nov 2009 01:14:54 +0100, "Martin v. Löwis"
>> <martin at v.loewis.de> wrote:
>>
>>> http://pycheesecake.org/
>>
>> Apparently, there is a service running somewhere that computes cheesecake
>> data for PyPI packages; it also sends them to PyPI. People have expressed
>> concerns that any kind of ranking based on kwalitee sounds fairly useless.

I would like to see something like
Cheesecake rating NNN/MMM
where the rating links to the full report so I can decide whether the 
missed points are things I am concerned about or not.

> I've had a look at this and been able to run the assessments locally
> on a number of packages that I've been able to download from pypi. It's
> interesting.

The fact that a 3rd party can download, install, and run the assessment 
indicates that it meets some bare minimum of quality. The system used
should be included in the report.

> I totally agree that some people might have issues with the terminology 
> being used there.

I remember being dissed for opining that 'cheeseshop' was too much of an 
esoteric and cutesy insider joke to be the public name and url for the 
catalog. While I appreciate that 'kwalitee' is intended to evade 
discussion about whether it really measures 'quality', I can see how 
someone who had not read the explanation could be put off by it. 
'Cheesecake rating' is pretty neutral, while still paying homage to M.P.

> What if the terminology could be cleaned up? And the tests extended?

If the main directory includes 'test_all.py', that could potentially be
run and the last line of the result reported, and bonus points given for 
passing. Perhaps some standard is needed for reporting success.
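
A minimal sketch of what such a check might look like. The helper name 
run_test_all and the timeout are my own assumptions, not part of 
Cheesecake or PyPI, and it takes exit code 0 to mean "passed", which is 
exactly the sort of convention a reporting standard would need to fix:

```python
import subprocess
import sys
from pathlib import Path


def run_test_all(package_dir, timeout=300):
    """Run test_all.py in package_dir (hypothetical helper).

    Returns None if the script is absent, otherwise a tuple
    (passed, last_line), where passed means the exit code was 0.
    """
    script = Path(package_dir) / "test_all.py"
    if not script.is_file():
        return None
    proc = subprocess.run(
        [sys.executable, script.name],
        cwd=str(package_dir),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # unittest writes its summary to stderr
        text=True,
        timeout=timeout,
    )
    # Keep only the last non-blank line of output for the report.
    lines = [ln for ln in proc.stdout.splitlines() if ln.strip()]
    last_line = lines[-1] if lines else ""
    return proc.returncode == 0, last_line
```

Running untrusted test code this way would of course have to happen in 
a sandbox.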

Another possibility with binaries is to run a checksum and then a virus
checker. The report should be something neutral like "Virus checker xxx 
found no problems with binary a.b with checksum MMM."
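
As a rough sketch of the checksum-and-report half of this: the function 
names file_md5 and scan_report are made up for illustration, and the 
virus scan itself is not performed here. Its verdict is simply passed 
in, since I am not assuming any particular checker or its interface.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def scan_report(checker_name, path, found_problems):
    """Format a neutral one-line report (hypothetical helper).

    found_problems is the verdict of an external virus checker;
    this function only reports it and does no scanning itself.
    """
    verdict = "found problems" if found_problems else "found no problems"
    return "Virus checker %s %s with binary %s with checksum %s." % (
        checker_name, verdict, path, file_md5(path))
```

Tying the verdict to a checksum makes clear exactly which file was 
scanned.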

As a Windows user, I worry about the possibility of a malware author 
masquerading as a fake developer, or even just a real developer 
unknowingly having a clever virus that silently somehow piggybacks on 
exported binaries. I believe there are websites that will run multiple 
checkers on submitted files.

Perhaps PyPI should push in the direction of Python download pages like

http://python.org/download/releases/3.1.1/

reporting

"The source tarballs are signed with Benjamin Peterson's key 
(fingerprint: 12EF 3DC3 8047 DA38 2D18 A5B9 99CD EA9D A413 5B38). The 
Windows installers were signed by Martin von Löwis' public key which has 
a key id of 7D9DC8D2. The public keys are located on the download page.

MD5 checksums and sizes of the released files:

f1317dbb2398374d6691edd5bff1b91d  11525876 python-3.1.1.tgz
d1ddd9f16e3c6a51c7208f33518cd674   9510105 python-3.1.1.tar.bz2
d31e3e91c2ddd3e5ea7c40abe436917e  14130176 python-3.1.1.amd64.msi
e05a6134b920ae86f0e33b8a43a801b3  13737984 python-3.1.1.msi
9c7f85cc7fb5a2fa533d338c88229633  17148746 python-3.1.1.dmg
"
(What is missing there is info on how to use this information ;-).
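
For the checksum part, one way to use the listing might be something 
like the sketch below. The names parse_published and verify_download 
are hypothetical, and the parser assumes each line has the "md5 size 
filename" layout shown above. Checking the signatures needs a separate 
tool such as GnuPG and is not covered here.

```python
import hashlib
import os


def parse_published(text):
    """Parse 'md5 size filename' lines into {filename: (md5, size)}."""
    entries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            digest, size, name = parts
            entries[name] = (digest.lower(), int(size))
    return entries


def verify_download(path, entries):
    """Return True if the file's size and MD5 match the published entry."""
    name = os.path.basename(path)
    if name not in entries:
        raise KeyError("no published checksum for %s" % name)
    digest, size = entries[name]
    if os.path.getsize(path) != size:
        return False
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            md5.update(chunk)
    return md5.hexdigest() == digest
```

A user would paste the published lines into parse_published() and call 
verify_download() on the file they fetched. A matching MD5 mainly rules 
out a corrupted download; the signature is the real guard against 
tampering.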

> What do you think would be more meaningful in terms of output? In regards
> to something that could be useful for potential publishing on pypi.

Terry Jan Reedy


