Hello,

this is an update on my work and the current status of Coverity Scan.

Maybe you have noticed checkins made by me that end with the line "CID #". These are checkins that fix an issue that was discovered by the static code analyzer Coverity. Coverity is a commercial product, but it's a free service for some Open Source projects. Python has been analyzed by Coverity since about 2007. Guido, Neal, Brett, Stefan and some other developers used Coverity before I took over. I fixed a couple of issues before 3.3 reached the RC phase and more bugs in the last couple of months.

Coverity is really great and its web GUI is fun to use, too. I was able to identify and fix resource leaks, NULL pointer issues, buffer overflows and missing checks all over the place. Because it's a static analyzer that follows data flows and control flows, the tool can detect issues in error paths that are hardly ever visited. I have started to document Coverity here:

http://docs.python.org/devguide/coverity.html

Interview
---------

A week ago I was contacted by Coverity. They have started a series of articles and press releases about Open Source projects that use their free service Coverity Scan, see

http://www.coverity.com/company/press-releases/read/coverity-introduces-mont...

Two days ago I had a lovely phone interview about my involvement in the Python project and our development style. They are going to release a nice article in a couple of weeks. In the meantime we have time to fix the remaining couple of issues. We *might* be able to reach the highest Coverity Integrity Level! I have dealt with all major issues, so we just have to fix a couple of minor ones.

Current stats
-------------

Lines of Code: 396,179
Defect Density: 0.05
Total defects: 1,054
Outstanding: 21 (Coverity Connect shows less)
Dismissed: 222
Fixed: 811

http://i.imgur.com/NoELjcj.jpg
http://i.imgur.com/eJSzTUX.jpg

open issues
-----------

http://bugs.python.org/issue17899
http://bugs.python.org/issue18556
http://bugs.python.org/issue18555
http://bugs.python.org/issue18552
http://bugs.python.org/issue18551
http://bugs.python.org/issue18550
http://bugs.python.org/issue18528

Christian
On 7/25/2013 2:48 PM, Christian Heimes wrote:
Hello,
this is an update on my work and the current status of Coverity Scan.
Great work.
Maybe you have noticed checkins made by me that end with the line "CID #". These are checkins that fix an issue that was discovered by the static code analyzer Coverity. Coverity is a commercial product, but it's a free service for some Open Source projects. Python has been analyzed by Coverity since about 2007. Guido, Neal, Brett, Stefan and some other developers used Coverity before I took over. I fixed a couple of issues before 3.3 reached the RC phase and more bugs in the last couple of months.
The benefit for us is not just improving Python but having external verification of its excellence in relation both to other open-source projects and commercial software.
Coverity is really great and its web GUI is fun to use, too. I was able to identify and fix resource leaks, NULL pointer issues, buffer overflows and missing checks all over the place. Because it's a static analyzer that follows data flows and control flows, the tool can detect issues in error paths that are hardly ever visited. I have started to document Coverity here:
http://docs.python.org/devguide/coverity.html
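The class of bug described above is easy to show in a generic C sketch (hypothetical helpers, not actual CPython code): a function acquires two resources, and the error path for the second must release the first. This is precisely the kind of branch a data-flow analyzer walks even when tests never do.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper for illustration only. */
static char *copy_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Duplicate two strings; return 0 on success, -1 on failure.
 * The free(ca) on the second error path is the detail that matters:
 * without it, 'ca' would leak whenever the second allocation fails. */
int dup_pair(const char *a, const char *b, char **out_a, char **out_b)
{
    char *ca = copy_str(a);
    if (ca == NULL)
        return -1;
    char *cb = copy_str(b);
    if (cb == NULL) {
        free(ca);   /* forgetting this line is the classic error-path leak */
        return -1;
    }
    *out_a = ca;
    *out_b = cb;
    return 0;
}
```

The names `copy_str` and `dup_pair` are invented for this sketch; the pattern, not the API, is the point.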
Interview
---------
A week ago I was contacted by Coverity. They have started a series of articles and press releases about Open Source projects that use their free service Coverity Scan, see
http://www.coverity.com/company/press-releases/read/coverity-introduces-mont...
The intention is to promote the best of open source to industry.
Two days ago I had a lovely phone interview about my involvement in the Python project and our development style. They are going to release a nice article in a couple of weeks. In the meantime we have time to fix the remaining couple of issues. We *might* be able to reach the highest Coverity Integrity Level! I have dealt with all major issues, so we just have to fix a couple of minor ones.
Current stats
-------------
Lines of Code: 396,179
C only? or does Python code now count as 'source code'?
Defect Density: 0.05
= defects per thousand lines = 20/400

Anything under 1 is good. The release above reports Samba now at .6.
http://www.pcworld.com/article/2038244/linux-code-is-the-benchmark-of-qualit...
reports Linux 3.8 as having the same for 7.6 million lines.
Total defects: 1,054
Outstanding: 21 (Coverity Connect shows less)
Dismissed: 222
This implies that they accept our designation of some things as False Positives or Intentional. Does Coverity do any review of such designations, so a project cannot cheat?
Fixed: 811
http://i.imgur.com/NoELjcj.jpg http://i.imgur.com/eJSzTUX.jpg
open issues
-----------
http://bugs.python.org/issue17899
http://bugs.python.org/issue18556
http://bugs.python.org/issue18555
http://bugs.python.org/issue18552
http://bugs.python.org/issue18551
http://bugs.python.org/issue18550
http://bugs.python.org/issue18528
-- Terry Jan Reedy
On 7/25/2013 6:00 PM, Terry Reedy wrote:
Defect Density: 0.05
= defects per thousand lines = 20/400
Anything under 1 is good. The release above reports Samba now at .6. http://www.pcworld.com/article/2038244/linux-code-is-the-benchmark-of-qualit...
reports Linux 3.8 as having the same for 7.6 million lines.
Total defects: 1,054
Outstanding: 21 (Coverity Connect shows less)
Dismissed: 222
This implies that they accept our designation of some things as False Positives or Intentional. Does Coverity do any review of such designations, so a project cannot cheat?
I found the answer here:
https://docs.google.com/file/d/0B5wQCOK_TiRiMWVqQ0xPaDEzbkU/edit

Coverity Integrity Level 1 is 1 (defect/1000 lines). Level 2 is .1 (we have passed that). Level 3 is .01 + no major defects + <20% (of all defects?) false positives, as that is their normal rate.#

A higher false positive rate requires auditing by Coverity. They claim "A higher false positive rate indicates misconfiguration, usage of unusual idioms, or incorrect diagnosis of a large number of defects." They also add "or a flaw in our analysis."

# Since false positives should stay constant as true positives are reduced toward 0, false / all should tend toward 1 (100%) if I understand the ratio correctly.
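Spelling out the arithmetic behind these thresholds with the numbers quoted in this thread (396,179 lines of code, 21 outstanding defects):

```latex
\text{defect density} = \frac{\text{outstanding defects}}{\text{KLOC}}
                      = \frac{21}{396.179} \approx 0.053
\qquad
\text{Level 3 allowance} = 0.01 \times 396.179 \approx 4 \text{ defects}
```

So the reported density of 0.05 is the 21 outstanding defects rounded against roughly 396 KLOC, and Level 3's .01 threshold permits at most about 4 outstanding defects at Python's size.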
Fixed: 811
http://i.imgur.com/NoELjcj.jpg http://i.imgur.com/eJSzTUX.jpg
-- Terry Jan Reedy
Am 26.07.2013 00:32, schrieb Terry Reedy:
I found the answer here https://docs.google.com/file/d/0B5wQCOK_TiRiMWVqQ0xPaDEzbkU/edit Coverity Integrity Level 1 is 1 (defect/1000 lines). Level 2 is .1 (we have passed that). Level 3 is .01 + no major defects + <20% (of all defects?) false positives, as that is their normal rate.#
A higher false positive rate requires auditing by Coverity. They claim "A higher false positive rate indicates misconfiguration, usage of unusual idioms, or incorrect diagnosis of a large number of defects." They also add "or a flaw in our analysis."
# Since false positives should stay constant as true positives are reduced toward 0, false / all should tend toward 1 (100%) if I understand the ratio correctly.
About 40% of the dismissed cases are caused by a handful of issues. I have documented these issues as "known limitations":
http://docs.python.org/devguide/coverity.html#known-limitations

For example, about 35 false positives are related to PyLong_FromLong() and our small integer optimization. A correct modeling file would eliminate the false positive defects. My attempts don't work as hoped and I don't have access to all professional Coverity tools to debug my trials.

Nearly 20 false positives are caused by Py_BuildValue("N"). I'm still astonished that Coverity understands Python's reference counting most of the time. :)

Did I mention that we have almost reached Level 3? All major defects have been dealt with (one of them locally on the test machine until Larry pushes his patch soonish), 4 of 7 minor issues must be closed, and our dismissed rate is just a little over 20% (222 out of 1054 = 21%).

Christian
On 7/25/2013 6:56 PM, Christian Heimes wrote:
Am 26.07.2013 00:32, schrieb Terry Reedy:
# Since false positives should stay constant as true positives are reduced toward 0, false / all should tend toward 1 (100%) if I understand the ratio correctly.
Which I did not ;-).
About 40% of the dismissed cases are caused by a handful of issues. I have documented these issues as "known limitations" http://docs.python.org/devguide/coverity.html#known-limitations .
For example, about 35 false positives are related to PyLong_FromLong() and our small integer optimization. A correct modeling file would eliminate the false positive defects. My attempts don't work as hoped and I don't have access to all professional Coverity tools to debug my trials.
Perhaps Coverity will help when doing an audit.
Nearly 20 false positives are caused by Py_BuildValue("N"). I'm still astonished that Coverity understands Python's reference counting most of the time. :)
Did I mention that we have almost reached Level 3? All major defects
It is hard to measure the benefit of preventive medicine, but I imagine that we should see fewer mysterious crashes and heisenbugs than we would have. In any case, Level 3 certification should help people promoting the use of Python in organizational settings, whether as employees or consultants.
have been dealt with (one of them locally on the test machine until Larry pushes his patch soonish), 4 of 7 minor issues must be closed and
.01 * 390 allows 3 defects (or 4 if they round up) -- astonishingly good!
our dismissed rate is just a little over 20% (222 out of 1054 = 21%).
So merely verifying the 35 PyLong_FromLong dismissals will put us under. Thanks for clarifying the proper denominator -- all defects ever found. It seems obvious in retrospect, but I was focused on current stats, not the history.

-- Terry Jan Reedy
On Thu, Jul 25, 2013 at 6:56 PM, Christian Heimes
Am 26.07.2013 00:32, schrieb Terry Reedy:
I found the answer here https://docs.google.com/file/d/0B5wQCOK_TiRiMWVqQ0xPaDEzbkU/edit Coverity Integrity Level 1 is 1 (defect/1000 lines). Level 2 is .1 (we have passed that). Level 3 is .01 + no major defects + <20% (of all defects?) false positives, as that is their normal rate.#
A higher false positive rate requires auditing by Coverity. They claim "A higher false positive rate indicates misconfiguration, usage of unusual idioms, or incorrect diagnosis of a large number of defects." They also add "or a flaw in our analysis."
# Since false positives should stay constant as true positives are reduced toward 0, false / all should tend toward 1 (100%) if I understand the ratio correctly.
About 40% of the dismissed cases are caused by a handful of issues. I have documented these issues as "known limitations" http://docs.python.org/devguide/coverity.html#known-limitations .
For example, about 35 false positives are related to PyLong_FromLong() and our small integer optimization. A correct modeling file would eliminate the false positive defects. My attempts don't work as hoped and I don't have access to all professional Coverity tools to debug my trials.
Have you tried asking for help from Coverity? They have been rather nice so far and they may be willing to just give us free help in getting the modeling file set up properly. -Brett
Nearly 20 false positives are caused by Py_BuildValue("N"). I'm still astonished that Coverity understands Python's reference counting most of the time. :)
Did I mention that we have almost reached Level 3? All major defects have been dealt with (one of them locally on the test machine until Larry pushes his patch soonish), 4 of 7 minor issues must be closed, and our dismissed rate is just a little over 20% (222 out of 1054 = 21%).
Am 26.07.2013 16:29, schrieb Brett Cannon:
Have you tried asking for help from Coverity? They have been rather nice so far and they may be willing to just give us free help in getting the modeling file set up properly.
Yes, I'm in contact with Dakshesh. I was able to figure out one model for a false positive on my own. Dakshesh is helping me with another. Christian
On Thu, 25 Jul 2013 18:00:55 -0400
Terry Reedy
On 7/25/2013 2:48 PM, Christian Heimes wrote:
Hello,
this is an update on my work and the current status of Coverity Scan.
Great work.
Maybe you have noticed checkins made by me that end with the line "CID #". These are checkins that fix an issue that was discovered by the static code analyzer Coverity. Coverity is a commercial product, but it's a free service for some Open Source projects. Python has been analyzed by Coverity since about 2007. Guido, Neal, Brett, Stefan and some other developers used Coverity before I took over. I fixed a couple of issues before 3.3 reached the RC phase and more bugs in the last couple of months.
The benefit for us is not just improving Python but having external verification of its excellence in relation both to other open-source projects and commercial software.
"Excellence"? The term is too weak, I would say "perfection" at least, but perhaps we should go as far as "divinity". Regards Antoine.
Am 26.07.2013 00:00, schrieb Terry Reedy:
http://www.coverity.com/company/press-releases/read/coverity-introduces-mont...
The intention is to promote the best of open source to industry.
I think it's also a marketing tool. They like to sell their product. I don't have a problem with that. After all, Coverity provides a useful service for free that supplements our own debugging tools.
Lines of Code: 396,179
C only? or does Python code now count as 'source code'?
It's just C code and headers. Coverity doesn't analyze Python code. According to cloc Python has 296707 + 78126 == 374833 lines of code in C and header files. I'm not sure why Coverity detects more.
Defect Density: 0.05
= defects per thousand lines = 20/400
Anything under 1 is good. The release above reports Samba now at .6. http://www.pcworld.com/article/2038244/linux-code-is-the-benchmark-of-qualit...
reports Linux 3.8 as having the same for 7.6 million lines.
These are amazing numbers. Python is much smaller.
Total defects: 1,054
Outstanding: 21 (Coverity Connect shows less)
Dismissed: 222
This implies that they accept our designation of some things as False Positives or Intentional. Does Coverity do any review of such designations, so a project cannot cheat?
What's the point of cheating? :) I could dismiss any remaining defect as intentional or a false positive, but that would only harm ourselves. As you already pointed out, Coverity reserves the right to inspect dismissed bugs for their highest ranking.

I'm in the process of looking through all dismissed defects. Some of them are relics of deleted files and removed code. Some others may go away with proper modeling.

Christian
Just a quick question - is there a chance to convince Coverity to detect
Python refcounting leaks in C API code :-) ? This could be useful not only
for Python but for extensions too. As it stands now, Coverity's leak
detection in Python must be pretty weak because almost everything is done
via PyObject refcounts.
Eli
_______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/eliben%40gmail.com
Am 26.07.2013 14:56, schrieb Eli Bendersky:
Just a quick question - is there a chance to convince Coverity to detect Python refcounting leaks in C API code :-) ? This could be useful not only for Python but for extensions too. As it stands now, Coverity's leak detection in Python must be pretty weak because almost everything is done via PyObject refcounts.
Coverity is able to detect some cases of refcount leaks. I don't know if the software is able to keep track of all reference counts. But it understands missing Py_DECREF() in error branches.

For example:

    PyObject *n = PyLong_FromLong(0);
    PyObject *u = PyUnicode_FromString("example");

    if (u == NULL) {
        return NULL;  /* Coverity detects that 'n' leaks memory */
    }

Christian
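The fix for the pattern above is a Py_DECREF(n) before the early return. The same control flow can be shown self-contained with a hand-rolled reference count standing in for PyObject (a toy model for illustration, not the real CPython API):

```c
#include <stdlib.h>

/* Toy stand-in for PyObject: just a heap object with a refcount. */
typedef struct { int refcnt; } Obj;

static Obj *obj_new(void)
{
    Obj *o = malloc(sizeof(Obj));
    if (o != NULL)
        o->refcnt = 1;
    return o;
}

static void obj_decref(Obj *o)
{
    if (o != NULL && --o->refcnt == 0)
        free(o);
}

/* Mirrors the example in the message: if the second allocation fails,
 * the first object must be released before the early return. */
static Obj *make_pair(int fail_second, Obj **out_n)
{
    Obj *n = obj_new();
    if (n == NULL)
        return NULL;
    Obj *u = fail_second ? NULL : obj_new();
    if (u == NULL) {
        obj_decref(n);   /* the "missing Py_DECREF" the analyzer reports */
        return NULL;
    }
    *out_n = n;
    return u;
}
```

`Obj`, `obj_new`, `obj_decref`, and `make_pair` are invented names for this sketch; in real CPython code the release call would be Py_DECREF(n).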
Le Fri, 26 Jul 2013 16:29:59 +0200,
Christian Heimes
Coverity is able to detect some cases of refcount leaks. I don't know if the software is able to keep track of all reference counts. But it understands missing Py_DECREF() in error branches.
For example:
    PyObject *n = PyLong_FromLong(0);
    PyObject *u = PyUnicode_FromString("example");
    if (u == NULL) {
        return NULL;  /* Coverity detects that 'n' leaks memory */
    }
But 'n' doesn't leak memory since PyLong_FromLong(0) is statically allocated ;-) More generally, in similar cases (e.g. replace "0" with a non-small integer), you don't need any knowledge of reference counts to infer that there is a memory leak. When the code discards the only existing pointer to a heap-allocated memory area, there's a leak. What we call "refcount leaks" is generally when an area is still pointer-accessible, but failure to decrement the reference count appropriately means it will never be released. Regards Antoine.
On Fri, Jul 26, 2013 at 7:29 AM, Christian Heimes
Am 26.07.2013 14:56, schrieb Eli Bendersky:
Just a quick question - is there a chance to convince Coverity to detect Python refcounting leaks in C API code :-) ? This could be useful not only for Python but for extensions too. As it stands now, Coverity's leak detection in Python must be pretty weak because almost everything is done via PyObject refcounts.
Coverity is able to detect some cases of refcount leaks. I don't know if the software is able to keep track of all reference counts. But it understands missing Py_DECREF() in error branches.
For example:
    PyObject *n = PyLong_FromLong(0);
    PyObject *u = PyUnicode_FromString("example");
    if (u == NULL) {
        return NULL;  /* Coverity detects that 'n' leaks memory */
    }
Interesting. I was thinking of something more general though. Especially if we can mark function arguments and return values as stealing references / creating new ones / etc, many many common refcount bugs can be detected with static analysis. This is definitely research-y, probably too much for our current stage of relationship with Coverity :) Eli
On Fri, Jul 26, 2013 at 8:56 AM, Eli Bendersky
Just a quick question - is there a chance to convince Coverity to detect Python refcounting leaks in C API code :-) ?
You can always ask. =)
This could be useful not only for Python but for extensions too. As it stands now, Coverity's leak detection in Python must be pretty weak because almost everything is done via PyObject refcounts.
Just an FYI (mostly for others since I think Eli was at PyCon in the relevant talk), David Malcolm has his work with gcc and refleak detection. But yes, it would be nice if it was in Coverity as it would then be part of the daily check. -Brett
Eli
participants (5)

- Antoine Pitrou
- Brett Cannon
- Christian Heimes
- Eli Bendersky
- Terry Reedy