dict method to retrieve a key from a value
A dict method to retrieve the key of a value from a bijective dict would have come in handy for me on several occasions.
My usual use case is when both keys and values have simple types like int/str, like:

    {
        'account-a': 123,
        'account-b': 456,
    }

Users may enter the account name, or the ID. The system works with the account ID (translating from the account name if needed), but when it needs to mention the account to the user, it shows the account name instead. I do understand that retrieval wouldn't be O(1).
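For illustration, the lookup described above can already be written as a linear scan over the items. This is only a sketch: `key_for_value` and `account_name` are made-up helper names, not existing dict methods, and the code assumes the dict really is one-to-one.

```python
def key_for_value(d, value):
    """Return the key mapped to `value`. O(n) scan; assumes values are unique."""
    for k, v in d.items():
        if v == value:
            return k
    raise KeyError(value)


accounts = {
    'account-a': 123,
    'account-b': 456,
}


def account_name(name_or_id):
    """Accept either an account name or an account ID and return the name."""
    if name_or_id in accounts:
        return name_or_id
    return key_for_value(accounts, name_or_id)
```

The scan is fine for small dicts; the replies in this thread are about what to do when O(n) retrieval is not good enough.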
Andre Delfino writes:
If O(log N) is good enough and bijectivity is guaranteed by some other mechanism, a bisection search on d.items() with key=lambda x: x[1] does the trick.
If you know the dict is supposed to be one-to-one and the keys and values are disjoint sets, as in your example, just

    def bijective_add(d, k, v):
        if k in d and d[k] != v:
            raise BijectiveDictValueChangeError
        d[k] = v
        d[v] = k

gives O(1) both ways. Maybe you need some kind of check to be sure you're retrieving the right type for the calling code. Otherwise you can pair two dicts in a class and endow it with your inverse method. I would do the check for bijectivity on addition (as above) rather than on retrieval, though.
If you really want to save the space, you can add an additional hashtable for the values in a C module, but it's not clear to me that any particular choice for bijectivity checks would be universally desirable in applications so that seems premature.
So I think you need to make clear what your goals are since there are at least four solutions with varying performance characteristics.
I'm agnostic on whether a dict type which guarantees bijectivity would be a good addition. The mathematician in me wants it, but my experience says the dict pair is good enough, YAGNI (for values of "you" == "me", anyway).
Regards, Steve
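A minimal sketch of the "pair two dicts in a class" variant mentioned above, with the bijectivity check done on addition. The class name `BiDict` and its `add`/`inverse` methods are assumptions made for this example, not an existing stdlib or PyPI API; the exception class is defined here only because the snippet above leaves it undefined.

```python
class BijectiveDictValueChangeError(ValueError):
    """Raised when an insert would break the one-to-one mapping."""


class BiDict:
    """Hypothetical wrapper around a forward dict and an inverse dict."""

    def __init__(self):
        self._forward = {}  # key -> value
        self._inverse = {}  # value -> key

    def add(self, key, value):
        # Check bijectivity on addition rather than on retrieval.
        if key in self._forward and self._forward[key] != value:
            raise BijectiveDictValueChangeError(key)
        if value in self._inverse and self._inverse[value] != key:
            raise BijectiveDictValueChangeError(value)
        self._forward[key] = value
        self._inverse[value] = key

    def __getitem__(self, key):
        return self._forward[key]

    def inverse(self, value):
        return self._inverse[value]
```

Both lookups are O(1) on average, at the cost of storing every pair twice and requiring hashable values. Unlike the single-dict version, keys and values do not need to be disjoint.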
I've had use cases for this, and written a class to do it (using Stephen's two-dict approach). I'd be really surprised if there wasn't one on PyPI -- the trick is to know what to call it to search for it. Now that I think about it -- my use case was many-to-many, not one-to-one -- so not quite the same -- each value was a list associated with a given key -- and vice versa.
I think it would be a bad idea to overload the built-in dict with anything like this. *Maybe* a new class for the collections module, if you can find a good, well-respected implementation, and folks can agree on what's wanted (performance characteristics, API, etc.) -- that's a pretty tall order though.
If O(log N) is good enough and bijectivity is guaranteed by some other mechanism, a bisection search on d.items() with key=lambda x: x[1] does the trick.
You'd also have to keep it sorted by value. So now you are guaranteeing bijectivity and keeping it sorted -- I'd just use two dicts :-) Though then the values would have to be hashable, so there's that. -CHB
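To make the sorting requirement concrete, here is a rough sketch of the bisection approach. `SortedValueIndex` is a hypothetical name; the sketch assumes the dict is already one-to-one, that the values are mutually orderable, and that the index is rebuilt whenever the dict changes.

```python
import bisect


class SortedValueIndex:
    """Hypothetical read-only index giving O(log N) value -> key lookup."""

    def __init__(self, d):
        # Sort the items by value once; this is the step that must be
        # repeated (or maintained incrementally) when the dict changes.
        pairs = sorted(d.items(), key=lambda item: item[1])
        self._keys = [k for k, _ in pairs]
        self._values = [v for _, v in pairs]

    def key_for(self, value):
        i = bisect.bisect_left(self._values, value)
        if i < len(self._values) and self._values[i] == value:
            return self._keys[i]
        raise KeyError(value)
```

This trades the hashability requirement of the two-dict approach for an orderability requirement, plus the bookkeeping needed to keep the lists sorted.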
-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
Christopher Barker writes:
You'd also have to keep it sorted by value.
I assumed you can do that with OrderedDict. But yeah, it's a little more complex. Maybe some of these ideas are complex and useful enough to deserve PyPI implementations, and if they prove to have corner cases and get take up, then consideration for stdlib.
Right (and I missed that throughout, so tyvm). So much as the "generic" theory of "low O" "bijection" is attractive, the implementation details are application-specific. Steve
In my experience, the implementation of a bijective dict is largely application-specific.
1. A unique-valued dict is fairly straightforward and there is little ambiguity on implementation using 2 dicts.
2. However, if values are not to be unique, then it largely depends on the application. (The best, as in most efficient, PyPI package I have found for this: https://pypi.org/project/indexed/)
But to me it feels that there is some sort of gap in container space. E.g. I have spent a reasonable amount of time on:
a) 1-2-1 dict
b) many-2-many dict
c) dict-deque
d) bijective-dict-deque
e) list-dict
Maybe it would be good to have a package similar to `more_itertools`, e.g. `more_collections`, where assorted recipes are implemented. I am not a fan of making my libraries dependent on lesser-known PyPI packages, but when a package is referenced in the official Python docs, then I am much more at ease.
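As one possible reading of case 2, where values are not unique, the inverse has to map each value to a collection of keys. The helper below is a sketch under that assumption; `invert_many` is a made-up name and says nothing about how the `indexed` package linked above works.

```python
from collections import defaultdict


def invert_many(d):
    """Map each value to the set of keys that carry it."""
    inverse = defaultdict(set)
    for key, value in d.items():
        inverse[value].add(key)
    return dict(inverse)


roles = {'alice': 'admin', 'bob': 'user', 'carol': 'admin'}
by_role = invert_many(roles)
```

Whether the inverse should hold sets, lists, or only the most recent key is exactly the application-specific choice described above.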
Google: "python more collections pypi" gets quite a few hits -- I haven't checked out any of them though.
Depending on a third-party package does take some thought -- but for something like this you could simply "vendor" a particular class (i.e. include the code with yours).
Totally different topic, but I do think that a "curated" package repo would be helpful -- there is a lot of cruft on PyPI :-(
-CHB
On Fri, Jun 30, 2023 at 3:47 AM Dom Grigonis <dom.grigonis@gmail.com> wrote:
On Sat, 1 Jul 2023 at 01:15, Christopher Barker <pythonchb@gmail.com> wrote:
Totally different topic, but I do think that a "curated" package repo would be helpful -- there is a lot of cruft on PyPI :-(
That idea gets thrown around every once in a while, but there are a few problems with it. When you "bless" one package, every other package doing a similar job will suffer, even if they are just as good (but simply haven't been added to the curated collection). If the PSF recommends a package, people will expect a lot of it, which is a huge burden on the developer(s). And someone has to go through all those packages, and then discuss it with whoever else has to be responsible for this curated collection, and come to an agreement. Instead, what I'd like to see is: Personal, individual blogs, recommending packages that the author knows about and can give genuine advice about. Provide YOUR curated collection. Then maybe a metapage on the Python Wiki could link to some useful/interesting blog posts. Decentralize! ChrisA
On Fri, Jun 30, 2023 at 8:34 AM Chris Angelico <rosuav@gmail.com> wrote:
Well yes, many .... I think there are a lot of packages that we could all agree are cruft -- pre-release stuff that hasn't been updated in years, etc, etc. Then there are those that have become pseudo standards: numpy, requests, more-itertools.. Then there is EVERYTHING in between -- which is most (by number anyway).
So what would a "curated" package repo be? I'm not sure -- though I'd like to see even a small amount of curation -- some barrier to get over so that we don't have the confusion of the real cruft. Unfortunately, the laudable goal of a low barrier to entry for putting a package up on PyPI, and the culture of packaging documentation oriented to PyPI, mean that a lot of folks put stuff up there even though there are few if any other users. So I think light curation would help a lot. [*]
If the PSF recommends a package
Who said anything about the PSF? ;-) -- but yes, that would be another way to go -- a tightly curated collection -- lower barrier to entry than the standard library, but still pretty high.
which is a huge burden on the developer(s).
Sure -- but it should be, that's kind of the point -- the idea is to have a way to identify high quality well maintained packages.
And someone has to go through all those packages, and then discuss it with whoever else has to be responsible for this curated collection, and come to an agreement.
Yup -- that's the biggest problem right there.
Are there not a lot of these already? -- That's how the current "cream of the crop" has risen for years. But it doesn't solve the OP's issue -- IIUC, they want to have some assurance that a given package is something that can be relied on, without having to do a bunch of research.
Decentralize!
Actually, I think the decentralized nature of what we have now is part of the problem. But this does give me an idea -- a single site that can collect recommendations and reviews -- maybe even as part of PyPI itself -- that could help folks find the good ones.
-Chris B
[*] conda-forge is an example of light curation -- nothing goes on the conda-forge channel without approval of the conda-forge core team. But they are reviewing only the conda package itself -- is it built right?, is it compatible with the rest of conda-forge?, does it have its license included?, ... -- not the quality or usefulness of the package itself. But this barrier to entry means that no one puts anything up there unless they have a good reason to, and there are some assurances that things will work together. I think even that helps a lot.
On Sat, 1 Jul 2023 at 16:09, Christopher Barker <pythonchb@gmail.com> wrote:
If the PSF recommends a package
Who said anything about the PSF? ;-) -- but yes, that would be another way to go -- a tightly curated collection -- lower barrier to entry than the standard library, but still pretty high.
Who other than the PSF? (PyPA would be, in the general public's eyes, equivalent - everything I said about the PSF would apply.) ChrisA
On Sat, 1 Jul 2023 at 07:09, Christopher Barker <pythonchb@gmail.com> wrote:
So I think light curation would help a lot. [*]
I'd be completely in favour of someone setting up a curated package index. You could probably use the Warehouse codebase if you didn't want to write your own. There would probably be a small amount of work to do re-branding. You might also need to write something to put a moderation layer on the upload interface. I'm not familiar with the Warehouse codebase, so I don't know what that would involve. Assuming it gets sufficient popularity, it could apply for PyPA membership if it wanted to be "official" (whatever that means ;-))
The problem isn't so much with the idea, it's with the fact that everyone likes talking about it, but no-one will actually *do* it. And everyone underestimates the amount of work involved - running PyPI, with its bare minimum curation (blocking malware and typosquatting) is a huge effort. Why do people think a new index with ambitions of more curation would take *less* effort? Or do people have the sort of resources that PyPI consumes lying around looking for something to do? Because if so, there's plenty of other projects looking for resources (a PyPI build farm, anyone?)
Who said anything about the PSF?
Nobody, I guess, but it's symptomatic of what I said above - everyone assumes *someone else* will do the work, and the convenient "someone else" is usually the PyPA or the PSF.
Paul
Another idea, maybe this group could host a simple git repo. It could cover cases where an idea for an extension is reasonable but not suited to the standard library, and where the existing open-source options are not reliable or it's time for consensus and centralisation. Then people can make some PRs and see where it goes. In this case, the OP could scan PyPI and make an initial PR.
On Sat, 1 Jul 2023 at 22:32, Dom Grigonis <dom.grigonis@gmail.com> wrote:
Another idea, maybe this group could host a simple git repo.
This has the same problem of who is curating it. If it's uncurated, that's PyPI as it already is. If it's controlled by the PyPA or PSF, then it gives too much authority. ChrisA
If it’s curated by more experienced members of this mailing list I would feel more confident in depending on it and more keen to contribute and review PRs. Maybe, with luck, if some good robust solution arises, it could be streamlined to python core library, if deemed appropriate.
It could provide some structure for how to go forward with certain queries that end up on this mailing list and, although potentially legit, do not progress due to lack of clarity about what to do next. More experienced members of this group could guide OPs and give feedback. If the OP wants to do some R&D, what’s there to lose?
If this was so, maybe by now we would be reviewing a potential PR with references to 3rd party solutions, shortcomings and a proposed implementation with benchmarks.
A PR/issue could have a checklist. E.g.
1. What potential Python stdlib solutions have you found and how were they lacking?
2. What potential 3rd party solutions have you found and how were they lacking?
What’s the worst that can happen?
On Sun, 2 Jul 2023 at 07:00, Dom Grigonis <dom.grigonis@gmail.com> wrote:
If it’s curated by more experienced members of this mailing list I would feel more confident in depending on it and more keen to contribute and review PRs. Maybe, with luck, if some good robust solution arises, it could be streamlined to python core library, if deemed appropriate.
What if those people disagree? That's why, in my personal opinion, it's personal opinions that should be posted, not any sort of authoritative list. That way, people can give those opinions the weight they deem fit. A nice collection of links to people's own personal recommendations would be both easier to do, and easier to not get wrong, than a formal and centralized listing. This eliminates the question of "who deserves to be the one to say what's good and what's not", and decentralizes the "but what about this one, you forgot this one" problem. However, the bigger problem of "who wants to actually go to the effort to make this happen?" still remains, as it always will.
What’s the worst that can happen?
The less centralized it is, the less bad things can happen. In fact, YOU could post a recommended list of packages if you want to! What *could* go wrong? You might forget to mention a really awesome package. No problem - someone else can. You might mention a package that someone thinks is trash. No problem - it's a personal opinion. You might mention something that doesn't support current versions of Python. Not a huge problem - that sort of thing happens, people have to do their own research anyway. (All of these would be a bit more serious if the listing were centralized, although not THAT big a deal even then.) Want to start things off? ChrisA
What’s the worst that can happen?
The less centralized it is, the less bad things can happen.
True, but the risk can be minimised if only appropriate cases were streamlined. Only well researched, well defined, unambiguous problems, that depend on mature components of core python and are lacking in PyPI and stdlib.
Want to start things off?
I don’t think I am in position to create a repo for this group, even deciding on a repo name is beyond my competence level here. And in the end, it’s just an idea, I am honestly not sure if it’s a good one. If someone more competent created it, OP of this request wanted to work on his query, others were positive on this idea and took time to give their 2 cents, then, if needed, I could take a role of one of the admins. In this particular case, having a robust bijective or more general reversible dict implementation would have saved me a fair bit of time, so I do have a bias here and inclination to contribute if it had a potentiality to be fruitful. I would not take this particular problem on, but if OP took it, I would gladly give my 2 cents of code and if solid solution came out I would happily use it. DG
On Sat, Jul 1, 2023 at 3:24 PM Dom Grigonis <dom.grigonis@gmail.com> wrote:
The less centralized it is, the less bad things can happen.
Sure, but in a way the less good things can happen as well :-(.
The OP of this thread is not alone -- folks want an authoritative source -- they may not get that, but a bunch of blogs by who knows who is about as far from that as you can get. And isn't that what we have already? If you google a question like "what's a good python package for xxxxx?" you generally get hits. I usually do that before I look directly on PyPI, because PyPI has a lot of cruft and it's hard to sort out -- I'd rather start with at least *someone's* recommendation.
Only well researched, well defined, unambiguous problems, that depend on mature components of core python and are lacking in PyPI and stdlib.
Not easy to get consensus on that :-)
I think it may be a good idea, though I'm pretty unclear on exactly what the idea is. I *think* it's a Python package review site, and if so, that could work. You'd need a modest-sized core team to review entries, and then allow anyone to contribute a review -- the core team would "simply" decide if it's a decent review or not. Like anything else, this would only work once there was some critical mass -- enough reviews that folks would notice it.
In this particular case, having a robust bijective or more general reversible dict implementation would have saved me a fair bit of time,
Way to bring it back on topic! Did you look on PyPI? I looked really quickly and there are some packages there, maybe something good.
-CHB
On Sun, 2 Jul 2023 at 15:11, Christopher Barker <pythonchb@gmail.com> wrote:
The OP of this thread is not alone -- folks want an authoritative source -- they may not get that
An authoritative source is absolutely perfect for someone who wants less choices. "Just give me the one and only option and promise me that it's perfect!" But that isn't the reality we live in, which means that for any non-trivial situation, there won't BE a valid authoritative list of "good packages". As one example, let's try looking for a regular expression parser. The standard library has one already, but PyPI has more. There's "regex", but also 500 pages of other hits for the search "regular expression" (albeit a lot of related tools that aren't actual regex parsers). There's "regular expressions for humans", "regular expressions for objects" (are those two opposites or unrelated?), "structural regular expressions", "objective regular expressions", and that's just from flipping through a couple of pages of summary. Which ones are "good packages"? Only regex? Only re (the one in the stdlib)? What if you want PCREs - there's no package called "pcre" but there's "pcre2", "python-pcre", and probably others. Importantly, the correct answer to this *depends on your use-case*. Which regular expression package do you want? *It depends*. So which one or ones should be in this curated list? Just the one most popular? You can easily find that from a simple web search. All of them? Now we're back to the original problem, but with more barriers to entry for any new package, which will have to appeal to be added to the curated list, lest it dwindle, perish, starve, pine, and die. Just some of them? Which ones? And that's for something relatively simple. How about a web framework? Which ones belong in the curated list? I don't think the PSF or PyPA should be in a position of making this list, because it would carry too much weight, too much importance. But if not them, then who? Hence, decentralization. An authoritative source is the easy solution for the reader, but a terrible one for the publisher, and ultimately, isn't a good solution for the reader either. 
It's not just a matter of how much work it would be. ChrisA
On Sat, Jul 1, 2023 at 10:24 PM Chris Angelico <rosuav@gmail.com> wrote:
I don't think that's it -- how about "Just give a few options that aren't worthless"? That would really help.
I think you've made my point -- who wants to wade through 500 packages? How many of those packages are reasonably well tested and maintained? I'll bet a good fraction of those are essentially worthless. Getting 50 hits would be a lot more manageable -- it doesn't have to be one.
Only re (the one in the stdlib)? What if you want PCREs - there's no package called "pcre" but there's "pcre2", "python-pcre", and probably others.
And are those three (and others?) actually useful maintained packages? Or someone's abandoned experiment? Who the heck knows without digging into each one?
NOTE: I did a (very) quick google to see if someone had written a blog about PCREs in Python that might provide some guidance -- no luck. I like your decentralized blog idea, but I'm not sure how to get people to write them :-)
-CHB
On Sun, 2 Jul 2023 at 15:48, Christopher Barker <pythonchb@gmail.com> wrote:
Would you know that even if they were on some official curated list? Projects get abandoned at any time, and unless the curators are constantly checking every package for currency, it's no different - except that people will expect it to be different.
NOTE: I did a (very) quick google to see if someone had written a blog about PCREs in Python that might provide some guidance -- no luck. I like your decentralized blog idea, but I'm not sure how to get people to write them :-)
Indeed. But I expect it'd be easier to get one person to write one blog post about their favourite packages than to create some sort of comprehensive list of everything that's good :) ChrisA
On Sat, Jul 1, 2023 at 11:01 PM Chris Angelico <rosuav@gmail.com> wrote:
It's quite different -- because at least the package was useful at SOME point. But yes, maintenance would be required -- back to the "it would take a lot of work" problem -- I'm not pretending that that's not the case. However, the amount of time the two of us have spent on this thread could have been used to review the status of quite a few packages :-)
Of course it would, but then you'd only know about a few packages -- not a great comparison. However, that's why I like the idea of a centralized package review site -- then you get that one person to write one review of the PCRE packages and post a PR. Then a couple hundred such people and you've got something! But critical mass would be hard to get.
-CHB
On Sun, 2 Jul 2023 at 16:36, Christopher Barker <pythonchb@gmail.com> wrote:
And reviewing those few packages would be just as minimal as anything else, because neither of us has any sort of comprehensive knowledge of packages :) But maybe it'd be a start.
However, that's why I like the idea of a centralized package review site -- then you get that one person to write one review of the PCRE packages and post a PR. Then a couple hundred such people and you've got something! But critical mass would be hard to get.
Every review (where a "review" might cover multiple packages) would have to have a date on it, so people know how useful the information is likely to be. Which means your centralized review site is really just a link to a bunch of separate things anyway. ChrisA
On Sun, 2 Jul 2023 at 06:11, Christopher Barker <pythonchb@gmail.com> wrote:
The OP of this thread is not alone -- folks want an authoritative source
Authority is built over time. Unless someone starts doing this, there will never be any authoritative source. In case it’s not obvious, none of the projects or groups with existing authority have the resource (and/or the interest) to take this on. The best route to leverage existing authority is for someone to start a new project, and then once it has reached a certain level of maturity, apply for PyPA membership. Unfortunately, too much of this discussion is framed as “someone should”, or “it would be good if”. No-one is saying “I will”. Naming groups, like “the PyPA should” doesn’t help either - groups don’t do things, people do. Who in the PyPA? Me? Nope, sorry, I don’t have the time or interest - I’d *use* a curated index, I sure as heck couldn’t *create* one. Paul.
On Sat, 1 Jul 2023 at 16:41, MRAB <python@mrabarnett.plus.com> wrote:
How about adding ticks to some PyPI packages? :-)
(There's still the problem of who gets to decide which ones get a tick...)
Precisely. This is just curation but having PyPI host the curators' decisions. And we've already established that the PyPI admins don't have the resources to curate themselves, so it still needs someone to do the job of curating, and now they *also* need to establish trust with the PyPI admins, who don't have the time or knowledge to review the curators' choices. It keeps going round the same circle. The bottleneck is still people. Paul
On Sat, Jul 1, 2023 at 8:42 AM MRAB <python@mrabarnett.plus.com> wrote:
(a PyPI build farm, anyone?)
Funny that you should mention that - I mentioned conda-forge as a lightly curated collection. But it didn't start out with that aim in mind. It started as a way for folks to collaborate on building conda packages. Then it turned into a build farm -- which is really its primary purpose today -- it just so happens that the process ends up providing some curation :-)
So maybe we could learn from that -- one problem with running a curated repository is that not only does the curation take work, but how do you get people to contribute to it? Until it reaches substantial critical mass, there'd be zero motivation for someone to put their package on there. conda-forge has been successful because from its beginning, and to this day, it's the only way to make a package easily accessible to the conda community. [*] Perhaps providing a build system would provide that motivation. But there would still be a massive critical mass problem.
-CHB
[*] Well, we did give out mugs at SciPy to folks that contributed a package -- but somehow I doubt that made a dent :-)
How about adding ticks to some PyPI packages? :-)
I do wonder if a public rating system ("upvote" or "stars" or ??) could work -- I'm very wary of how it could be gamed or otherwise abused, but maybe there are ways of controlling that? We can get stats on how often different packages are downloaded, which is a start, but a way for the community to highlight the cream of the crop, and everyone to see that would be nice, if it worked.
-CHB
Chris Angelico writes:
On Sat, 1 Jul 2023 at 01:15, Christopher Barker <pythonchb@gmail.com> wrote:
Totally different topic, but I do think that a "curated" package repo would be helpful -- there is a lot of cruft on PyPI :-(
Sounds like a "standard library". I understand the difference, but like Chris A I'm dubious about whether there's really a lane for it, or whether like bike lanes in Japan you'd just find a lot of illegal parking in it. ;-)
But if somebody's going to put in effort to review PyPI, I'd really rather see them go after "typo squatters". Most are probably just clout chasers, but we know that some are malware, far more dangerous than merely "cruft".
Chris Angelico writes:
I think this is a good way to go, especially if reviewers link to each other, building community as well as providing package reviews. For example, while my needs are limited enough that I haven't actually tried any of his stuff, I've found Simon Willison's (datasette.io) tweets interesting. (datasette itself, of course, and he's tweeted a lot about LLMs recently too, but here I'm referring to his more random tweets about utilities he's discovered or created.) There are also some idiosyncratic curated package collections (the FLUFL packages, etc), as well as a lot of frameworks (Django, the Zope components, lazr) that seem to sprout utility packages regularly.
I'm sure there's a lane if somebody wants to go around blogging about all the "stuff" they see.
Steve
On Sat, 1 Jul 2023 at 18:43, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Ouch :) Though I wish more cities could do what Amsterdam's been doing. Especially the city I live in. Would love to see some walkable areas and real viable bike lanes here. The main difference between standard library and "blessed package" would be the update cycle. Standard library modules can only be updated when Python itself is updated; a blessed package (say, for instance, 'requests') can update any time it wants to.
They're definitely working on that.
Yeah. All we need is for people to start ranting on the internet. Can't be THAT hard right? ChrisA
Chris Angelico writes:
On Sat, 1 Jul 2023 at 18:43, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
OK, I'll take that as "Steve, report the ones you've found already". :-) (Since they're a project I work on I feel like I should give an expert opinion, rather than just a "I smell fish" email. :-)
Yeah. All we need is for people to start ranting on the internet. Can't be THAT hard right?
Good rants are hard to find! Steve
I've had use cases for this, and written a class to do it (using Stephen's two dict approach). I'd be really surprised if there wasn't one on PyPI -- the trick is to know what to call it to search for it. Now that I think about it -- my use case was many-to-many, not one to one -- so not quite the same -- each value was a list associated with a given key -- and vice versa. I think it would be a bad idea to overload the built in dict with anything like this. *maybe* a new class for the collections module, if you can find a good well respected implementation, and folks can agree on what's wanted (performance characteristics, API, etc..) -- that's a pretty tall order though.
If O(log N) is good enough and bijectivity is guaranteed by some other mechanism, a bisection search on d.items() with key=lambda x: x[1] does the trick.
You'd also have to keep it sorted by value. So now you are guaranteeing bijectivity and keeping it sorted -- I'd just use two dicts :-) Though then the values would have to be hashable, so there's that. -CHB
-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
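For readers following along, the two-dict approach mentioned above could look roughly like the sketch below. The class name `BiDict` and its `inverse` method are hypothetical names chosen for illustration, not an existing stdlib or PyPI API, and the choice to check bijectivity on insertion follows Steve's suggestion rather than any agreed design.

```python
class BiDict:
    """Hypothetical one-to-one mapping backed by two plain dicts."""

    def __init__(self):
        self._forward = {}  # key -> value
        self._inverse = {}  # value -> key

    def __setitem__(self, key, value):
        # Reject a value that is already bound to a different key.
        # Unhashable values fail here with TypeError.
        if value in self._inverse and self._inverse[value] != key:
            raise ValueError(f"value {value!r} is already mapped from another key")
        # When rebinding an existing key, drop its stale inverse entry first.
        if key in self._forward:
            del self._inverse[self._forward[key]]
        self._forward[key] = value
        self._inverse[value] = key

    def __getitem__(self, key):
        return self._forward[key]

    def inverse(self, value):
        """Return the key mapped to `value` (average O(1))."""
        return self._inverse[value]

    def __len__(self):
        return len(self._forward)


accounts = BiDict()
accounts["account-a"] = 123
accounts["account-b"] = 456
print(accounts.inverse(456))  # account-b
```

The cost is roughly double the memory, plus the requirement that values be hashable, as noted above.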
Christopher Barker writes:
You'd also have to keep it sorted by value.
I assumed you can do that with OrderedDict. But yeah, it's a little more complex. Maybe some of these ideas are complex and useful enough to deserve PyPI implementations, and if they prove to have corner cases and get taken up, then consideration for the stdlib.
Right (and I missed that throughout, so tyvm). So much as the "generic" theory of "low O" "bijection" is attractive, the implementation details are application-specific. Steve
In my experience, the implementation of bijective dict is largely application specific.
1. Unique-valued dict is fairly straightforward and there is little ambiguity on implementation using 2 dicts.
2. However, if values are not to be unique, then it largely depends on application. (best (as in most efficient) PyPI package I have found for this: https://pypi.org/project/indexed/)
But to me it feels that there is some sort of gap in container space. E.g. I have spent a reasonable amount of time on:
a) 1-2-1 dict
b) many-2-many dict
c) dict-deque
d) bijective-dict-deque
e) list-dict
Maybe it would be good to have a similar package to `more_itertools`. E.g. `more_collections`, where assorted recipes are implemented. I am not a fan of making my libraries dependent on lesser-known PyPI packages, but when a package is referenced in the official Python docs, then I am much more at ease.
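As one illustration of why the non-unique case is application specific, a reverse index over a dict with repeated values has to return a collection of keys rather than a single key. The sketch below shows one possible choice (a set per value); the function name `invert` and the sample data are assumptions for illustration only.

```python
from collections import defaultdict


def invert(mapping):
    """Build a value -> set-of-keys index for a dict whose values may repeat."""
    inverse = defaultdict(set)
    for key, value in mapping.items():
        inverse[value].add(key)
    return dict(inverse)


owners = {"account-a": 123, "account-b": 456, "alias-a": 123}
by_id = invert(owners)
print(sorted(by_id[123]))  # ['account-a', 'alias-a']
```

Whether the inverse should hold a set, a list that preserves insertion order, or only the first key seen is exactly the kind of decision that differs between applications.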
Google: "python more collections pypi" gets quite a few hits -- I haven't checked out any of them though. Depending on a third party package does take some thought -- but for something like this you could simply "vendor" a particular class (i.e. include the code with yours). Totally different topic, but I do think that a "curated" package repo would be helpful -- there is a lot of cruft on PyPI :-( -CHB
On Fri, Jun 30, 2023 at 3:47 AM Dom Grigonis <dom.grigonis@gmail.com> wrote:
On Sat, 1 Jul 2023 at 01:15, Christopher Barker <pythonchb@gmail.com> wrote:
Totally different topic, but I do think that a "curated" package repo would be helpful -- there is a lot of cruft on PyPI :-(
That idea gets thrown around every once in a while, but there are a few problems with it. When you "bless" one package, every other package doing a similar job will suffer, even if they are just as good (but simply haven't been added to the curated collection). If the PSF recommends a package, people will expect a lot of it, which is a huge burden on the developer(s). And someone has to go through all those packages, and then discuss it with whoever else has to be responsible for this curated collection, and come to an agreement. Instead, what I'd like to see is: Personal, individual blogs, recommending packages that the author knows about and can give genuine advice about. Provide YOUR curated collection. Then maybe a metapage on the Python Wiki could link to some useful/interesting blog posts. Decentralize! ChrisA
On Fri, Jun 30, 2023 at 8:34 AM Chris Angelico <rosuav@gmail.com> wrote:
Well yes, many .... I think there are a lot of packages that we could all agree are cruft -- pre-release stuff that hasn't been updated in years, etc, etc. Then there are those that have become pseudo standards: numpy, requests, more-itertools.. Then there is EVERYTHING in between -- which is most (by number anyway). So what would a "curated" package repo be? I'm not sure -- though I'd like to see even a small amount of curation -- some barrier to get over so that we don't have the confusion of the real cruft. Unfortunately, the laudable goal of a low barrier to entry for putting a package up on PyPI, and the culture of packaging documentation oriented to PyPI, means that a lot of folks put stuff up there even though there are few if any other users. So I think light curation would help a lot. [*]
If the PSF recommends a package
Who said anything about the PSF? ;-) -- but yes, that would be another way to go -- a tightly curated collection -- lower barrier to entry than the standard library, but still pretty high.
which is a huge burden on the developer(s).
Sure -- but it should be, that's kind of the point -- the idea is to have a way to identify high quality well maintained packages.
And someone has to go through all those packages, and then discuss it with whoever else has to be responsible for this curated collection, and come to an agreement.
yup -- that's the biggest problem right there.
Are there not a lot of these already? -- That's how the current "cream of the crop" has risen for years. But it doesn't solve the OP's issue -- IIUC, they want to have some assurance that a given package is something that can be relied on, without having to do a bunch of research.
Decentralize!
Actually, I think the decentralized nature of what we have now is part of the problem. But this does give me an idea -- a single site that can collect recommendations and reviews -- maybe even as part of PyPI itself -- that could help folks find the good ones. -Chris B
[*] conda-forge is an example of light curation -- nothing goes on the conda-forge channel without approval of the conda-forge core team. But they are reviewing only the conda package itself -- is it built right?, is it compatible with the rest of conda-forge?, does it have its license included?, ... -- not the quality or usefulness of the package itself. But this barrier to entry means that no one puts anything up there unless they have a good reason to, and there are some assurances that things will work together. I think even that helps a lot.
On Sat, 1 Jul 2023 at 16:09, Christopher Barker <pythonchb@gmail.com> wrote:
If the PSF recommends a package
Who said anything about the PSF? ;-) -- but yes, that would be another way to go -- a tightly curated collection -- lower barrier to entry than the standard library, but still pretty high.
Who other than the PSF? (PyPA would be, in the general public's eyes, equivalent - everything I said about the PSF would apply.) ChrisA
On Sat, 1 Jul 2023 at 07:09, Christopher Barker <pythonchb@gmail.com> wrote:
So I think light curation would help a lot. [*]
I'd be completely in favour of someone setting up a curated package index. You could probably use the Warehouse codebase if you didn't want to write your own. There would probably be a small amount of work to do re-branding. You might also need to write something to put a moderation layer on the upload interface. I'm not familiar with the Warehouse codebase, so I don't know what that would involve. Assuming it gets sufficient popularity, it could apply for PyPA membership if it wanted to be "official" (whatever that means ;-)). The problem isn't so much with the idea, it's with the fact that everyone likes talking about it, but no-one will actually *do* it. And everyone underestimates the amount of work involved - running PyPI, with its bare minimum curation (blocking malware and typosquatting) is a huge effort. Why do people think a new index with ambitions of more curation would take *less* effort? Or do people have the sort of resources that PyPI consumes lying around looking for something to do? Because if so, there's plenty of other projects looking for resources (a PyPI build farm, anyone?)
Who said anything about the PSF?
Nobody, I guess, but it's symptomatic of what I said above - everyone assumes *someone else* will do the work, and the convenient "someone else" is usually the PyPA or the PSF. Paul
Another idea, maybe this group could host a simple git repo. If an idea for an extension is reasonable but not suited to the standard library, and the open-source options are unreliable or it's time for consensus and centralisation, then people can make some PRs and see where it goes. In this case, the OP could scan PyPI and make an initial PR.
On Sat, 1 Jul 2023 at 22:32, Dom Grigonis <dom.grigonis@gmail.com> wrote:
Another idea, maybe this group could host a simple git repo.
This has the same problem of who is curating it. If it's uncurated, that's PyPI as it already is. If it's controlled by the PyPA or PSF, then it gives too much authority. ChrisA
If it’s curated by more experienced members of this mailing list I would feel more confident in depending on it and more keen to contribute and review PRs. Maybe, with luck, if some good robust solution arises, it could be streamlined to python core library, if deemed appropriate. It could provide some structure for how to go forward with certain queries that end up in this mailing list and, although potentially legit, do not progress due to lack of clarity about what to do next. More experienced members of this group could guide OPs and give feedback. If the OP wants to do some R&D, what’s there to lose? If this was so, maybe by now we would be reviewing a potential PR with references to 3rd party solutions, shortcomings and proposed implementation with benchmarks. PR/Issue could have a check list. E.g.
1. What potential Python stdlib solutions have you found and how were they lacking?
2. What potential 3rd party solutions have you found and how were they lacking?
What’s the worst that can happen?
On Sun, 2 Jul 2023 at 07:00, Dom Grigonis <dom.grigonis@gmail.com> wrote:
If it’s curated by more experienced members of this mailing list I would feel more confident in depending on it and more keen to contribute and review PRs. Maybe, with luck, if some good robust solution arises, it could be streamlined to python core library, if deemed appropriate.
What if those people disagree? That's why, in my personal opinion, it's personal opinions that should be posted, not any sort of authoritative list. That way, people can give those opinions the weight they deem fit. A nice collection of links to people's own personal recommendations would be both easier to do, and easier to not get wrong, than a formal and centralized listing. This eliminates the question of "who deserves to be the one to say what's good and what's not", and decentralizes the "but what about this one, you forgot this one" problem. However, the bigger problem of "who wants to actually go to the effort to make this happen?" still remains, as it always will.
What’s the worst that can happen?
The less centralized it is, the less bad things can happen. In fact, YOU could post a recommended list of packages if you want to! What *could* go wrong? You might forget to mention a really awesome package. No problem - someone else can. You might mention a package that someone thinks is trash. No problem - it's a personal opinion. You might mention something that doesn't support current versions of Python. Not a huge problem - that sort of thing happens, people have to do their own research anyway. (All of these would be a bit more serious if the listing were centralized, although not THAT big a deal even then.) Want to start things off? ChrisA
What’s the worst that can happen?
The less centralized it is, the less bad things can happen.
True, but the risk can be minimised if only appropriate cases were streamlined. Only well researched, well defined, unambiguous problems, that depend on mature components of core python and are lacking in PyPI and stdlib.
Want to start things off?
I don’t think I am in position to create a repo for this group, even deciding on a repo name is beyond my competence level here. And in the end, it’s just an idea, I am honestly not sure if it’s a good one. If someone more competent created it, OP of this request wanted to work on his query, others were positive on this idea and took time to give their 2 cents, then, if needed, I could take a role of one of the admins. In this particular case, having a robust bijective or more general reversible dict implementation would have saved me a fair bit of time, so I do have a bias here and inclination to contribute if it had a potentiality to be fruitful. I would not take this particular problem on, but if OP took it, I would gladly give my 2 cents of code and if solid solution came out I would happily use it. DG
On Sat, Jul 1, 2023 at 3:24 PM Dom Grigonis <dom.grigonis@gmail.com> wrote:
The less centralized it is, the less bad things can happen.
Sure, but in a way the less good things can happen as well :-(.
The OP of this thread is not alone -- folks want an authoritative source -- they may not get that, but a bunch of blogs by who knows who is about as far from that as you can get. And isn't that what we have already? If you google a question like "what's a good python package for xxxxx?" you generally get hits. I usually do that before I look directly on PyPI, because PyPI has a lot of cruft and it's hard to sort out -- I'd rather start with at least *someone's* recommendation.
Only well researched, well defined, unambiguous problems, that depend on mature components of core python and are lacking in PyPI and stdlib.
Not easy to get consensus on that :-)
I think it may be a good idea, though I'm pretty unclear on exactly what the idea is. I *think* it's a Python package review site, and if so, that could work. You'd need a modest sized core team to review entries, and then allow anyone to contribute a review -- the core team would "simply" decide if it's a decent review or not. Like anything else, this would only work once there was some critical mass -- enough reviews that folks would notice it.
In this particular case, having a robust bijective or more general reversible dict implementation would have saved me a fair bit of time,
Way to bring it back on topic! Did you look on PyPI? I looked really quickly and there are some packages there, maybe something good. -CHB
On Sun, 2 Jul 2023 at 15:11, Christopher Barker <pythonchb@gmail.com> wrote:
The OP of this thread is not alone -- folks want an authoritative source -- they may not get that
An authoritative source is absolutely perfect for someone who wants less choices. "Just give me the one and only option and promise me that it's perfect!" But that isn't the reality we live in, which means that for any non-trivial situation, there won't BE a valid authoritative list of "good packages". As one example, let's try looking for a regular expression parser. The standard library has one already, but PyPI has more. There's "regex", but also 500 pages of other hits for the search "regular expression" (albeit a lot of related tools that aren't actual regex parsers). There's "regular expressions for humans", "regular expressions for objects" (are those two opposites or unrelated?), "structural regular expressions", "objective regular expressions", and that's just from flipping through a couple of pages of summary. Which ones are "good packages"? Only regex? Only re (the one in the stdlib)? What if you want PCREs - there's no package called "pcre" but there's "pcre2", "python-pcre", and probably others. Importantly, the correct answer to this *depends on your use-case*. Which regular expression package do you want? *It depends*. So which one or ones should be in this curated list? Just the one most popular? You can easily find that from a simple web search. All of them? Now we're back to the original problem, but with more barriers to entry for any new package, which will have to appeal to be added to the curated list, lest it dwindle, perish, starve, pine, and die. Just some of them? Which ones? And that's for something relatively simple. How about a web framework? Which ones belong in the curated list? I don't think the PSF or PyPA should be in a position of making this list, because it would carry too much weight, too much importance. But if not them, then who? Hence, decentralization. An authoritative source is the easy solution for the reader, but a terrible one for the publisher, and ultimately, isn't a good solution for the reader either. 
It's not just a matter of how much work it would be. ChrisA
On Sat, Jul 1, 2023 at 10:24 PM Chris Angelico <rosuav@gmail.com> wrote:
I don't think that's it -- how about "Just give me a few options that aren't worthless"? That would really help.
I think you've made my point -- who wants to wade through 500 packages? How many of those packages are reasonably well tested and maintained? I'll bet a good fraction of those are essentially worthless. Getting 50 hits would be a lot more manageable -- it doesn't have to be one.
Only re (the one in the stdlib)? What if you want PCREs - there's no package called "pcre" but there's "pcre2", "python-pcre", and probably others.
And are those three (and others?) actually useful maintained packages? Or someone's abandoned experiment? Who the heck knows without digging into each one? NOTE: I did a (very) quick google to see if someone had written a blog about PCREs in Python that might provide some guidance -- no luck. I like your decentralized blog idea, but I'm not sure how to get people to write them :-) -CHB
On Sun, 2 Jul 2023 at 15:48, Christopher Barker <pythonchb@gmail.com> wrote:
Would you know that even if they were on some official curated list? Projects get abandoned at any time, and unless the curators are constantly checking every package for currency, it's no different - except that people will expect it to be different.
NOTE: I did a (very) quick google to see if someone had written a blog about PCREs in Python that might provide some guidance -- no luck. I like your decentralized blog idea, but I'm not sure how to get people to write them :-)
Indeed. But I expect it'd be easier to get one person to write one blog post about their favourite packages than to create some sort of comprehensive list of everything that's good :) ChrisA
On Sat, Jul 1, 2023 at 11:01 PM Chris Angelico <rosuav@gmail.com> wrote:
It's quite different -- because at least the package was useful at SOME point. But yes, maintenance would be required -- back to the "it would take a lot of work" problem -- I'm not pretending that that's not the case. However, the amount of time the two of us have spent on this thread could have been used to review the status of quite a few packages :-)
Of course it would, but then you'd only know about a few packages -- not a great comparison. However, that's why I like the idea of a centralized package review site -- then you get that one person to write one review of the PCRE packages and post a PR. Then a couple hundred such people and you've got something! But critical mass would be hard to get. -CHB
On Sun, 2 Jul 2023 at 16:36, Christopher Barker <pythonchb@gmail.com> wrote:
And reviewing those few packages would be just as minimal as anything else, because neither of us has any sort of comprehensive knowledge of packages :) But maybe it'd be a start.
However, that's why I like the idea of a centralized package review site -- then you get that one person to write one review of the PCRE packages and post a PR. Then a couple hundred such people and you've got something! But critical mass would be hard to get.
Every review (where a "review" might cover multiple packages) would have to have a date on it, so people know how useful the information is likely to be. Which means your centralized review site is really just a link to a bunch of separate things anyway. ChrisA
On Sun, 2 Jul 2023 at 06:11, Christopher Barker <pythonchb@gmail.com> wrote:
The OP of this thread is not alone -- folks want an authoritative source
Authority is built over time. Unless someone starts doing this, there will never be any authoritative source. In case it’s not obvious, none of the projects or groups with existing authority have the resource (and/or the interest) to take this on. The best route to leverage existing authority is for someone to start a new project, and then once it has reached a certain level of maturity, apply for PyPA membership. Unfortunately, too much of this discussion is framed as “someone should”, or “it would be good if”. No-one is saying “I will”. Naming groups, like “the PyPA should” doesn’t help either - groups don’t do things, people do. Who in the PyPA? Me? Nope, sorry, I don’t have the time or interest - I’d *use* a curated index, I sure as heck couldn’t *create* one. Paul.
On Sat, 1 Jul 2023 at 16:41, MRAB <python@mrabarnett.plus.com> wrote:
How about adding ticks to some PyPI packages? :-)
(There's still the problem of who gets to decide which ones get a tick...)
Precisely. This is just curation but having PyPI host the curators' decisions. And we've already established that the PyPI admins don't have the resources to curate themselves, so it still needs someone to do the job of curating, and now they *also* need to establish trust with the PyPI admins, who don't have the time or knowledge to review the curators' choices. It keeps going round the same circle. The bottleneck is still people. Paul
On Sat, Jul 1, 2023 at 8:42 AM MRAB <python@mrabarnett.plus.com> wrote:
(a PyPI build farm, anyone?)
Funny that you should mention that - I mentioned conda-forge as a lightly curated collection. But it didn't start out with that aim in mind. It started as a way for folks to collaborate on building conda packages. Then it turned into a build farm -- which is really its primary purpose today -- it just so happens that the process ends up providing some curation :-) So maybe we could learn from that -- one problem with running a curated repository is that not only does the curation take work, but how do you get people to contribute to it? Until it reaches substantial critical mass, there'd be zero motivation for someone to put their package on there. conda-forge has been successful because from its beginning, and to this day, it's the only way to make a package easily accessible to the conda community. [*] Perhaps providing a build system would provide that motivation. But there would still be a massive critical mass problem. -CHB
[*] Well, we did give out mugs at SciPy to folks that contributed a package -- but somehow I doubt that made a dent :-)
How about adding ticks to some PyPI packages? :-)
I do wonder if a public rating system ("upvote" or "stars" or ??) could work -- I'm very wary of how it could be gamed or otherwise abused, but maybe there are ways of controlling that? We can get stats on how often different packages are downloaded, which is a start, but a way for the community to highlight the cream of the crop, and everyone to see that would be nice, if it worked. -CHB
Chris Angelico writes:
On Sat, 1 Jul 2023 at 01:15, Christopher Barker <pythonchb@gmail.com> wrote:
Totally different topic, but I do think that a "curated" package repo would be helpful -- there is a lot of cruft on PyPI :-(
Sounds like a "standard library". I understand the difference, but like Chris A I'm dubious about whether there's really a lane for it, or whether like bike lanes in Japan you'd just find a lot of illegal parking in it. ;-) But if somebody's going to put in effort to review PyPI, I'd really rather see them go after "typo squatters". Most are probably just clout chasers, but we know that some are malware, far more dangerous than merely "cruft".
Chris Angelico writes:
I think this is a good way to go, especially if reviewers link to each other, building community as well as providing package reviews. For example, while my needs are limited enough that I haven't actually tried any of his stuff, I've found Simon Willison's (datasette.io) tweets interesting. (datasette itself, of course, and he's tweeted a lot about LLMs recently too, but here I'm referring to his more random tweets about utilities he's discovered or created.) There are also some idiosyncratic curated package collections (the FLUFL packages, etc), as well as a lot of frameworks (Django, the Zope components, lazr) that seem to sprout utility packages regularly. I'm sure there's a lane if somebody wants to go around blogging about all the "stuff" they see. Steve
On Sat, 1 Jul 2023 at 18:43, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Ouch :) Though I wish more cities could do what Amsterdam's been doing. Especially the city I live in. Would love to see some walkable areas and real viable bike lanes here. The main difference between standard library and "blessed package" would be the update cycle. Standard library modules can only be updated when Python itself is updated; a blessed package (say, for instance, 'requests') can update any time it wants to.
They're definitely working on that.
Yeah. All we need is for people to start ranting on the internet. Can't be THAT hard right? ChrisA
participants (8)
- Andre Delfino
- Barry Scott
- Chris Angelico
- Christopher Barker
- Dom Grigonis
- MRAB
- Paul Moore
- Stephen J. Turnbull