[Python-ideas] stdlib crowdsourcing

Yuval Greenfield ubershmekel at gmail.com
Tue Jun 12 18:11:11 CEST 2012


On Tue, May 29, 2012 at 8:05 AM, anatoly techtonik <techtonik at gmail.com> wrote:

> The problem with stdlib - it is all damn subjective. There is no
> process to add functions and modules if you're not well-behaved and
> skilled in public debates and don't have really a lot of time to be a
> champion of your module/function. In other words - it is hard (if not
> impossible for 80% of Python Earth population). So, many people and
> projects decide to opt-out. Take a look at Twisted - a lot of useful
> stuff, but not in Python stdlib. So..
>
> Provide a way for people to opt-out from core stuff, but still allow
> to share the changes and update code if necessary.
>
> This will require:
> - a local stdlib Python path convention
> - snippet normalization function and AST hash dumper
> - web site with stats
> - source code crawler
>
> How it works:
> 1. Every project maintains its own stdlib directory with functions
> that they feel are good to have in standard library
> 2. Functions are placed so that they are imported as if from standard
> library, but this time with stdlib prefix
> 3. The license for this directory is public domain to remove all legal
> barriers (credits are welcome, but optional)
> 4. Crawler (probably PyPI) scans this stdlib dir, finds functions,
> normalizes them, calculates hash and submits to web site
>  4.1 Normalization is required to find the shared function
> copy/pasted across different projects with different
>        indentation level, docstrings, parameters/variable names etc.
>  4.2 Hash is calculated upon AST. There are at least three hashes for
> each entry:
>       4.2.1 Full hash - all docstrings and variable names are
> preserved, whitespace normalized
>       4.2.2 Stripped hash - docstrings are stripped, variable names
> are normalized
>       4.2.3 Signature hash - a mark placed in a comment above
> function name, either calculated from function
>                signature or generated randomly, used for manual
> tracking of copy/paste e.g. pd:ac546df6b8340a92
> 5. Web site maintains usage and popularity stats, accepts votes on
> inclusion of snippets
>
>
> User stories:
> 1. "I want to find if there is a better/updated version of my function
> available"
>   1.1  I enter hash into web site search form
>   1.2  Site gives me a link to my snippet
>   1.3  I can see what people proposed to replace this function with
>   1.4  I can choose the function with most votes
>   1.5  I can flag the functions I find irrelevant
>   1.6  I can tag the functions that diverge in a different direction
> than I need, to filter them out
>
> 2. "I want to reuse code snippets without additional dependencies on
> 3rd party projects"
>   2.1  Just place them into my own stdlib directory
>
> 3. "I want to update code snippets when there is an update for them"
>   3.1  I run the scanner; it extracts signature hashes and stripped hashes
> and checks whether the web-site version of each signature matches the
> normalized hash
>
> 4. "I want to see what people want to include in the next Python version"
>   4.1  A call for proposals is made
>   4.2  People place wannabe's into their stdlib dirs
>   4.3  Crawl generates new functions on a web site
>   4.4  Functions are categorized
>   4.5  Optionally included / declined with a short one-liner reason - why
>   4.6  Optionally provided with more detailed info why
>
> --- feature creep cut ---
> 5. "I want to see what functions are popular in other languages"
>   5.1  A separate crawler for Ruby, PHP etc. stdlib converts their
> AST into compatible format where possible
>   5.2  Submit to site stats
>
> 6. "I want to download the function in Ruby format"
>   6.1  AST converter tries to do the job automatically where possible
>   6.2  If it fails - you are encouraged to fix the converter rules or
> write the replacement for this signature manually
>
>
> Just an idea.
> --
> anatoly t.
>
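For reference, the "full hash" and "stripped hash" from step 4.2 of the
quoted proposal could be sketched roughly as below. This is only an
illustration under my own assumptions, not anything the proposal specifies:
the helper names (full_hash, stripped_hash), the choice of SHA-1 truncated
to 16 hex digits (to match the length of the pd:... example), and the rule
that only arguments and locally assigned names get renamed are all made up
for the sketch.

```python
import ast
import hashlib
import textwrap


def _parse_function(source):
    # dedent() lets the same snippet be pasted at any indentation level.
    tree = ast.parse(textwrap.dedent(source))
    func = tree.body[0]
    if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
        raise ValueError("expected a single function definition")
    return func


def _digest(node):
    # ast.dump() omits line/column info by default, so whitespace and
    # comments do not influence the hash.
    return hashlib.sha1(ast.dump(node).encode("utf-8")).hexdigest()[:16]


def _strip_docstring(func):
    body = func.body
    if (body and isinstance(body[0], ast.Expr)
            and isinstance(body[0].value, ast.Constant)
            and isinstance(body[0].value.value, str)):
        func.body = body[1:] or [ast.Pass()]


def _local_names(func):
    # Arguments and names assigned inside the function, in walk order.
    for node in ast.walk(func):
        if isinstance(node, ast.arg):
            yield node.arg
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            yield node.id


class _Renamer(ast.NodeTransformer):
    """Replace local names with positional placeholders (v0, v1, ...)."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_arg(self, node):
        self.generic_visit(node)
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

    def visit_Name(self, node):
        # Globals and builtins (len, sum, ...) are not in the mapping
        # and are therefore left alone.
        node.id = self.mapping.get(node.id, node.id)
        return node


def full_hash(source):
    """Docstrings and names preserved, whitespace normalized."""
    return _digest(_parse_function(source))


def stripped_hash(source):
    """Docstring removed, argument/local variable names normalized."""
    func = _parse_function(source)
    _strip_docstring(func)
    mapping = {}
    for name in _local_names(func):
        mapping.setdefault(name, "v%d" % len(mapping))
    _Renamer(mapping).visit(func)
    return _digest(func)
```

With this, two copies of a function that differ only in indentation,
docstring and local variable names get different full hashes but the same
stripped hash, while functions that call different builtins still hash
differently. The function's own name is kept in both hashes here; whether
it should be is a design choice the proposal leaves open.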

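The scanner from user story 3 of the quoted proposal (find the signature
marks, then check them against what the site has) might look something
like the following. Again this is just a sketch with assumptions of mine:
the pd:<16 hex digits> format is taken from the single example in the
proposal, the function names are hypothetical, and the "published" data is
a plain dict standing in for whatever the web site would return.

```python
import re

# Assumed mark format, based on the example "pd:ac546df6b8340a92".
MARK = re.compile(r"\bpd:([0-9a-f]{16})\b")
DEF = re.compile(r"\s*(?:async\s+)?def\s+(\w+)\s*\(")


def find_marks(source):
    """Return {function_name: mark} for functions with a mark comment above."""
    marks = {}
    pending = None
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            found = MARK.search(stripped)
            if found:
                pending = found.group(1)
            continue
        if stripped.startswith("@"):
            # Decorators may sit between the comment and the def.
            continue
        match = DEF.match(line)
        if match and pending:
            marks[match.group(1)] = pending
        if stripped:
            # Any other code line ends the "comment above function" window.
            pending = None
    return marks


def stale_marks(local, published):
    """Marks whose local stripped hash differs from the published one.

    Both arguments map mark -> stripped hash; marks unknown to the
    site are ignored.
    """
    return sorted(
        mark for mark, digest in local.items()
        if mark in published and published[mark] != digest
    )
```

A real scanner would probably use the tokenize module rather than line
matching, so that a "#" inside a string literal cannot be mistaken for a
comment.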

I think having a separate site "anatoly's std-lib" which somehow
implemented an easy install of the top 10-100 most useful/popular/selected
packages on PyPI could be nice. I considered making such a bundle myself a
while ago.

I don't think it really needs to be python.org sanctioned.

Yuval

PS I like how candid the replies you got were, and indeed getting a reply
is better than the sound of crickets. Though some of these replies carried
the scent of excrement poredom - the author's need to import niceness.