[Python-ideas] stdlib crowdsourcing
Yuval Greenfield
ubershmekel at gmail.com
Tue Jun 12 18:11:11 CEST 2012
On Tue, May 29, 2012 at 8:05 AM, anatoly techtonik <techtonik at gmail.com> wrote:
> The problem with stdlib - it is all damn subjective. There is no
> process to add functions and modules if you're not well-behaved and
> skilled in public debates and don't really have a lot of time to be a
> champion of your module/function. In other words - it is hard (if not
> impossible) for 80% of the Earth's Python population. So, many people
> and projects decide to opt-out. Take a look at Twisted - a lot of useful
> stuff, but not in Python stdlib. So...
>
> Provide a way for people to opt-out from core stuff, but still allow
> to share the changes and update code if necessary.
>
> This will require:
> - a local stdlib Python path convention
> - snippet normalization function and AST hash dumper
> - web site with stats
> - source code crawler
>
> How it works:
> 1. Every project maintains its own stdlib directory with functions
>    that they feel are good to have in standard library
> 2. Functions are placed so that they are imported as if from standard
>    library, but this time with stdlib prefix
> 3. The license for this directory is public domain to remove all legal
>    barriers (credits are welcome, but optional)
> 4. Crawler (probably PyPI) scans this stdlib dir, finds functions,
>    normalizes them, calculates hash and submits to web site
>    4.1 Normalization is required to find the shared function
>        copy/pasted across different projects with different
>        indentation level, docstrings, parameters/variable names etc.
>    4.2 Hash is calculated upon AST. There are at least three hashes for
>        each entry:
>        4.2.1 Full hash - all docstrings and variable names are
>              preserved, whitespace normalized
>        4.2.2 Stripped hash - docstrings are stripped, variable names
>              are normalized
>        4.2.3 Signature hash - a mark placed in a comment above
>              function name, either calculated from function
>              signature or generated randomly, used for manual
>              tracking of copy/paste e.g. pd:ac546df6b8340a92
> 5. Web site maintains usage and popularity stats, accepts votes on
>    inclusion of snippets
>
>
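
Steps 4.1 and 4.2 above can be sketched with the standard ast and hashlib
modules. This is only a rough illustration under my own assumptions, not
anything anatoly specified: the helper names (full_hash, stripped_hash), the
choice of SHA-1 truncated to 16 hex digits, the v0/v1/... placeholder scheme,
and keeping the function's own name in the hash are all made up here. Note
that ast.dump() output can differ between Python versions, so such hashes
would only be comparable when computed by the same interpreter version.

```python
import ast
import hashlib
import textwrap


def _parse_function(source):
    # dedent() lets us accept snippets pasted at any indentation level.
    tree = ast.parse(textwrap.dedent(source))
    func = tree.body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("expected a single function definition")
    return func


def _digest(node):
    # ast.dump() leaves out line/column attributes by default, so
    # whitespace and layout differences do not affect the hash.
    return hashlib.sha1(ast.dump(node).encode("utf-8")).hexdigest()[:16]


def full_hash(source):
    """Hash with docstrings and names preserved, layout ignored."""
    return _digest(_parse_function(source))


def stripped_hash(source):
    """Hash with the docstring removed and local names replaced."""
    func = _parse_function(source)

    # Drop the leading docstring, keeping the body non-empty.
    if ast.get_docstring(func, clean=False) is not None:
        func.body = func.body[1:] or [ast.Pass()]

    # First pass: collect parameters and assigned names.
    mapping = {}
    for node in ast.walk(func):
        if isinstance(node, ast.arg):
            mapping.setdefault(node.arg, "v%d" % len(mapping))
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            mapping.setdefault(node.id, "v%d" % len(mapping))

    # Second pass: rename them; builtins and globals are left alone.
    for node in ast.walk(func):
        if isinstance(node, ast.arg):
            node.arg = mapping[node.arg]
        elif isinstance(node, ast.Name) and node.id in mapping:
            node.id = mapping[node.id]

    return _digest(func)
```

With this sketch, two copies of a function that differ only in indentation,
docstring and variable names get different full hashes but the same stripped
hash, which is the property the crawler would need to group them.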
> User stories:
> 1. "I want to find if there is a better/updated version of my function
>    available"
>    1.1 I enter hash into web site search form
>    1.2 Site gives me a link to my snippet
>    1.3 I can see what people proposed to replace this function with
>    1.4 I can choose the function with most votes
>    1.5 I can flag the functions I find irrelevant, or
>    1.6 I can tag the functions that diverge in a different direction
>        than I need, to filter them out
>
> 2. "I want to reuse code snippets without additional dependencies on
>    3rd party projects"
>    2.1 Just place them into my own stdlib directory
>
> 3. "I want to update code snippets when there is an update for them"
>    3.1 I run the scanner; it extracts signature hashes and stripped
>        hashes and checks whether the web site version for each
>        signature matches the normalized hash
>
> 4. "I want to see what people want to include in the next Python version"
>    4.1 A call for proposals is made
>    4.2 People place wannabes into their stdlib dirs
>    4.3 Crawl generates new functions on a web site
>    4.4 Functions are categorized
>    4.5 Optionally included / declined with a short one-liner reason - why
>    4.6 Optionally provided with more detailed info why
>
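
The scanner from user story 3 might look roughly like the following. Again a
sketch under assumptions of mine: the marker format is guessed from the single
example pd:ac546df6b8340a92 (the prefix pd: followed by 16 lowercase hex
digits, on the comment line directly above the def), and the names
find_markers and needs_update are hypothetical. The web site lookup is
replaced by a plain dict, since no site or API exists.

```python
import re

# Assumed marker format, inferred from the one example in the proposal.
MARKER = re.compile(r"^\s*#\s*pd:([0-9a-f]{16})\s*$")
DEF = re.compile(r"^\s*def\s+([A-Za-z_]\w*)\s*\(")


def find_markers(source):
    """Return {function_name: marker} for each def that has a marker
    comment on the line directly above it."""
    found = {}
    pending = None
    for line in source.splitlines():
        marker = MARKER.match(line)
        if marker:
            pending = marker.group(1)
            continue
        definition = DEF.match(line)
        if definition and pending is not None:
            found[definition.group(1)] = pending
        # Any other line (including a decorator) cancels a pending marker.
        pending = None
    return found


def needs_update(local, published):
    """Both arguments map marker -> stripped hash; 'published' stands in
    for whatever the web site would report. Returns the markers whose
    published hash differs from the local one."""
    return sorted(
        marker
        for marker, digest in local.items()
        if marker in published and published[marker] != digest
    )
```

A marker that the site does not know about is simply skipped here; whether
that should count as "up to date" or "unknown" is left open.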
> --- feature creep cut ---
> 5. "I want to see what functions are popular in other languages"
>    5.1 A separate crawler for Ruby, PHP etc. stdlib converts their
>        AST into compatible format where possible
>    5.2 Submit to site stats
>
> 6. "I want to download the function in Ruby format"
>    6.1 AST converter tries to do the job automatically where possible
>    6.2 If it fails - you are encouraged to fix the converter rules or
>        write the replacement for this signature manually
>
>
> Just an idea.
> --
> anatoly t.
>
I think having a separate site, "anatoly's std-lib", which somehow
implemented an easy install of the top 10-100 most useful/popular/selected
packages on PyPI could be nice. I considered making such a bundle myself a
while ago.
I don't think it really needs to be python.org sanctioned.
Yuval
PS I like how candid the replies you got were, and indeed getting a reply
is better than the sound of crickets. Though some of these replies carried
the scent of excrement poredom - the author's need to import niceness.