Average number of style errors in Python

Hi guys, I am currently searching for a paper or note that reports the average number of style errors in Python code. If someone knows about this kind of work, I would be grateful.
-- Juan BC (from phone)

I personally don't, but I'd be interested if you ended up finding something.
On Saturday, June 11, 2016, Juan BC <jbc.develop@gmail.com> wrote:
> Hi guys, I am currently searching for a paper or note that reports the average number of style errors in Python code. If someone knows about this kind of work, I would be grateful.
-- ~ Ian Lee | IanLee1521@gmail.com

Thanks for the quick reply. I think I am going to retrieve a lot of public gists and store them in a dataframe. I am also thinking of retrieving some unknown Python packages, mostly because the famous ones are really good in quality.
On Sat, 11 Jun 2016 at 13:24 Ian Lee <ianlee1521@gmail.com> wrote:
> I personally don't, but I'd be interested if you ended up finding something.
-- Juan BC (from phone)

On Sat, Jun 11, 2016 at 1:32 PM, Juan BC <jbc.develop@gmail.com> wrote:
> Thanks for the quick reply.
> I think I am going to retrieve a lot of public gists and store them in a dataframe. I am also thinking of retrieving some unknown Python packages, mostly because the famous ones are really good in quality.
That's actually not entirely the case. I can think of a couple "famous" or popular packages that violate PEP-0008 quite a lot because the creator dislikes the style guide.

Even the CPython standard library has quite a few violations.
On Saturday, June 11, 2016, Ian Cordasco <graffatcolmingov@gmail.com> wrote:
> That's actually not entirely the case. I can think of a couple "famous" or popular packages that violate PEP-0008 quite a lot because the creator dislikes the style guide.
-- ~ Ian Lee | IanLee1521@gmail.com

True. But if I measure a big project, I am going to measure the tolerance of its community to pep8 errors. For example, yesterday I ran flake8 on the astropy repository and it reported 11461 errors, while scikit-learn had only 5.
On Sat, 11 Jun 2016 at 23:19 Ian Lee <ianlee1521@gmail.com> wrote:
> Even the CPython standard library has quite a few violations.
-- Juan BC (from phone)
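
Raw totals such as 11461 versus 5 depend heavily on repository size, so a rate per 1000 lines is easier to compare across projects. The following is only a minimal sketch, assuming flake8's default report format (`path:row:col: CODE message`). The helper names, the sample report, and the line count are hypothetical and not taken from either project.

```python
import re
from collections import Counter

# flake8's default report line looks like: path:row:col: CODE message
REPORT_LINE = re.compile(r"^.+?:\d+:\d+: (?P<code>[A-Z]+\d+) ")


def count_violations(report):
    """Tally violation codes found in the text of a flake8 report."""
    counts = Counter()
    for line in report.splitlines():
        match = REPORT_LINE.match(line)
        if match:
            counts[match.group("code")] += 1
    return counts


def violations_per_kloc(report, total_lines):
    """Violations per 1000 source lines; total_lines is counted separately."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    return 1000 * sum(count_violations(report).values()) / total_lines


# Hypothetical report text, e.g. captured from running flake8 on a checkout.
sample_report = """\
pkg/a.py:1:1: F401 'os' imported but unused
pkg/a.py:10:80: E501 line too long (85 > 79 characters)
pkg/b.py:3:1: E302 expected 2 blank lines, found 1
pkg/b.py:7:80: E501 line too long (91 > 79 characters)
"""

counts = count_violations(sample_report)
rate = violations_per_kloc(sample_report, total_lines=200)
print(counts.most_common())
print(rate)
```

Keeping the per-code tally next to the overall rate also shows whether a large total comes from one rule (for example long lines) or is spread across many.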

When I publish the results, I am going to release the dataset I am currently creating.
On Sun, 12 Jun 2016 at 13:31 Juan BC <jbc.develop@gmail.com> wrote:
> True. But if I measure a big project, I am going to measure the tolerance of its community to pep8 errors. For example, yesterday I ran flake8 on the astropy repository and it reported 11461 errors, while scikit-learn had only 5.
-- Juan BC (from phone)

You should keep in mind that these projects don't all use Flake8. Astropy uses only pep8/pycodestyle and scikit-learn uses landscape.io. Simply running flake8 on projects won't be sufficient. You'll have to take into account what style means to each project if you're going to identify the number of style errors.
On Sun, Jun 12, 2016 at 11:31 AM, Juan BC <jbc.develop@gmail.com> wrote:
> True. But if I measure a big project, I am going to measure the tolerance of its community to pep8 errors. For example, yesterday I ran flake8 on the astropy repository and it reported 11461 errors, while scikit-learn had only 5.
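
One way to take into account what style means to each project is to read the checker settings the project itself ships. The following is a minimal sketch, assuming the project keeps those settings in an INI-style file such as `setup.cfg` or `tox.ini` under a `[flake8]`, `[pycodestyle]` or `[pep8]` section. The helper names and the sample file content are hypothetical and not taken from any real project.

```python
import configparser

# Section names read by flake8, pycodestyle and the older pep8 tool.
STYLE_SECTIONS = ("flake8", "pycodestyle", "pep8")


def project_style_settings(config_text):
    """Return {section: {option: value}} for the style sections present."""
    # interpolation=None keeps literal '%' characters from raising errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return {
        section: dict(parser.items(section))
        for section in STYLE_SECTIONS
        if parser.has_section(section)
    }


def ignored_codes(settings, section):
    """Split a section's `ignore` option into a sorted list of codes."""
    raw = settings.get(section, {}).get("ignore", "")
    parts = raw.replace("\n", ",").split(",")
    return sorted(code.strip() for code in parts if code.strip())


# Hypothetical setup.cfg content.
sample_cfg = """\
[metadata]
name = example

[pycodestyle]
max-line-length = 100
ignore = E402, W503
"""

settings = project_style_settings(sample_cfg)
print(settings)
print(ignored_codes(settings, "pycodestyle"))
```

A project that uses a hosted service or no configuration file at all would return an empty result here, so this covers only part of what "style" means to a project.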

As I said before, I am not interested in the quality of a project as judged by the tolerance of its own community. My interest is quality based on a random sample of code from a random sample of programmers. I use flake8 because my quality index is also based on flake8.
On Sun, 12 Jun 2016 at 14:31 Ian Cordasco <graffatcolmingov@gmail.com> wrote:
> You should keep in mind that these projects don't all use Flake8. Astropy uses only pep8/pycodestyle and scikit-learn uses landscape.io. Simply running flake8 on projects won't be sufficient. You'll have to take into account what style means to each project if you're going to identify the number of style errors.
-- Juan BC (from phone)
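
A random sample of code from a random sample of programmers can be drawn in two stages: first pick programmers, then pick files from each. The following is a minimal sketch, assuming a mapping from programmer to file names has already been collected. The function name, the inventory, and the sample sizes are hypothetical.

```python
import random


def two_stage_sample(files_by_author, n_authors, n_files, seed=0):
    """Pick n_authors at random, then up to n_files at random from each."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    authors = rng.sample(sorted(files_by_author), n_authors)
    return {
        author: rng.sample(
            sorted(files_by_author[author]),
            min(n_files, len(files_by_author[author])),
        )
        for author in authors
    }


# Hypothetical inventory of snippets per programmer.
inventory = {
    "alice": ["a1.py", "a2.py", "a3.py"],
    "bob": ["b1.py"],
    "carol": ["c1.py", "c2.py"],
    "dave": ["d1.py", "d2.py", "d3.py", "d4.py"],
}

sample = two_stage_sample(inventory, n_authors=2, n_files=2)
print(sample)
```

Sampling programmers first gives each programmer the same chance of selection regardless of how much code they have published, which is a design choice rather than the only valid scheme.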

* Juan BC <jbc.develop@gmail.com> [2016-06-12 19:00:23 +0000]:
> As I said before, I am not interested in the quality of a project as judged by the tolerance of its own community.
> My interest is quality based on a random sample of code from a random sample of programmers. I use flake8 because my quality index is also based on flake8.
Using flake8 errors as a measurement of quality is flawed to begin with, in my opinion. As Ian said, coding style is subjective. If a project has their own style guide instead of following pep8 (maybe because it was created before pep8 was widespread, like Twisted I believe, or many parts of the stdlib), does it have a worse quality? I don't believe so.
Even pep8 says:
    Many projects have their own coding style guidelines. In the event of any conflicts, such project-specific guides take precedence for that project.
    A style guide is about consistency. Consistency with this style guide is important. Consistency within a project is more important. Consistency within one module or function is the most important.
    [...]
    Some other good reasons to ignore a particular guideline:
    [...]
    - Because the code in question predates the introduction of the guideline and there is no other reason to be modifying that code.
Florian
--
http://www.the-compiler.org | me@the-compiler.org (Mail/XMPP)
GPG: 916E B0C8 FD55 A072 | http://the-compiler.org/pubkey.asc
I love long mails! | http://email.is-not-s.ms/

On Sun, Jun 12, 2016 at 2:00 PM, Juan BC <jbc.develop@gmail.com> wrote:
> As I said before, I am not interested in the quality of a project as judged by the tolerance of its own community.
You didn't say that actually.
> My interest is quality based on a random sample of code from a random sample of programmers. I use flake8 because my quality index is also based on flake8.
I think you need to watch https://www.youtube.com/watch?v=wf-BqAjZb8M
Following the letter of the document isn't following the spirit, and the quality of a code base can be measured on far more than one axis. Quality of a code base should also be determined by whether it contains obvious security problems (as can be caught via bandit) and other logical errors (as can be caught by pylint). Quality can be further measured by the tests a project has.
I urge you to reconsider how you're measuring "quality". Instead of calling it quality, please consider calling it "style conformance". They are very different things.

Sorry, I think this is a sterile discussion. I have seen this video before, and I never said that my quality index is ONLY style (also, all your recommendations are already included in the other parts of the equation). I am not offended or angry, but because of how (F%$$#%) scientific publication works, I can't tell you everything about the research before the journal makes its decision about it. Thanks for the comments. PS: I am going to release the dataset in the near future.
On Sun, 12 Jun 2016 at 17:18 Ian Cordasco <graffatcolmingov@gmail.com> wrote:
> You didn't say that actually.
> I think you need to watch https://www.youtube.com/watch?v=wf-BqAjZb8M
> Following the letter of the document isn't following the spirit and the quality of a code base can be measured on far more than one axis. Quality of a code base should also be determined by whether it uses obvious security problems (as can be caught via bandit) and other logical errors (as can be caught by pylint). Quality can be further measured by the tests a project has.
> I urge you to reconsider how you're measuring "quality". Instead of calling it quality, please consider calling it "style conformance". They are very different things.
-- Juan BC (from phone)
participants (4)
- Florian Bruhin
- Ian Cordasco
- Ian Lee
- Juan BC