segfaults due to hash randomization in C OrderedDict - Python-Dev

Eric Snow

21 May 2015

9:55 a.m.

(see http://bugs.python.org/issue16991)

I am working on resolving an intermittent segfault that my C OrderedDict patch introduces. The failure happens in test_configparser (RawConfigParser uses OrderedDict internally), but only sporadically. However, Ned pointed out to me that it appears to be related to hash randomization, which I have verified. I'm looking into it.

In the meantime, here's a specific question. What would lead to the pattern of failures I'm seeing? I've verified that the segfault happens consistently for certain hash randomization seeds and never for the rest. I don't immediately recognize the pattern but expect that it would shed some light on where the problem lies. I ran the following command with the OrderedDict patch applied:

for i in `seq 1 100`; do echo $i; PYTHONHASHSEED=$i ./python -m test.regrtest -m test_basic test_configparser ; done

Through 100 I get segfaults with seeds of 7, 15, 35, 37, 39, 40, 42, 47, 50, 66, 67, 85, 87, 88, and 92. I expect the distribution across all seeds is uniform, but I haven't verified that.

Thoughts?

-eric
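The shell loop above prints each seed and leaves it to the reader to spot the crashes. A minimal Python sketch of the same sweep, collecting the crashing seeds automatically, might look like the following. The helper name crashing_seeds is hypothetical, and the check assumes a POSIX system, where subprocess reports a child killed by a signal as a negative return code.

```python
import os
import signal
import subprocess


def crashing_seeds(cmd, seeds):
    """Run cmd once per seed and return the seeds whose run died from SIGSEGV."""
    crashed = []
    for seed in seeds:
        # Same effect as prefixing the command with PYTHONHASHSEED=<seed>.
        env = dict(os.environ, PYTHONHASHSEED=str(seed))
        proc = subprocess.run(
            cmd,
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX, a negative return code -N means "killed by signal N".
        if proc.returncode == -signal.SIGSEGV:
            crashed.append(seed)
    return crashed
```

Run from a patched checkout, a call such as crashing_seeds(["./python", "-m", "test.regrtest", "-m", "test_basic", "test_configparser"], range(1, 101)) would be expected to return the list of seeds reported above, though that depends on the exact build.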

MRAB

21 May 2015

12:17 p.m.

On 2015-05-21 15:55, Eric Snow wrote:

...

(see http://bugs.python.org/issue16991)

I am working on resolving an intermittent segfault that my C OrderedDict patch introduces. The failure happens in test_configparser (RawConfigParser uses OrderedDict internally), but only sporadically. However, Ned pointed out to me that it appears to be related to hash randomization, which I have verified. I'm looking into it.

In the meantime, here's a specific question. What would lead to the pattern of failures I'm seeing? I've verified that the segfault happens consistently for certain hash randomization seeds and never for the rest. I don't immediately recognize the pattern but expect that it would shed some light on where the problem lies. I ran the following command with the OrderedDict patch applied:

for i in `seq 1 100`; do echo $i; PYTHONHASHSEED=$i ./python -m test.regrtest -m test_basic test_configparser ; done

Through 100 I get segfaults with seeds of 7, 15, 35, 37, 39, 40, 42, 47, 50, 66, 67, 85, 87, 88, and 92. I expect the distribution across all seeds is uniform, but I haven't verified that.

Thoughts?

In "_odict_get_index", for example (there are others), you're caching "ma_keys":

PyDictKeysObject *keys = ((PyDictObject *)od)->ma_keys;

If it resizes, you go back to the label "start", which is after that line, but could "ma_keys" change when it's resized?
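The hazard raised here, a cached reference to storage that a resize replaces, can be illustrated with a toy model. This is only an analogy in Python, not the odict code: the Table class and the function name are hypothetical. In C the old allocation may be freed, so a stale pointer can read invalid memory; in Python the old list merely stays alive and out of date.

```python
class Table:
    """Toy table whose key storage is replaced wholesale on resize."""

    def __init__(self, size=8):
        # Stands in for the separately allocated keys object of a dict.
        self.keys = [None] * size

    def resize(self, new_size):
        # A fresh list replaces the old one, as a resize allocates new storage.
        self.keys = [None] * new_size


def cache_is_still_valid(table, trigger_resize):
    """Cache table.keys, optionally resize, then report whether the cache is current."""
    keys = table.keys  # cached before anything that might resize the table
    if trigger_resize:
        table.resize(len(table.keys) * 2)
    # False means the cached reference no longer points at the live storage.
    return keys is table.keys
```

Re-reading table.keys after any step that might resize, rather than before it, is the pattern the question above is asking about.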

Eric Snow

4:52 p.m.

Good catch. Unfortunately, sticking "keys = ((PyDictObject *)od)->ma_keys;" right after "hash = ..." did not make a difference. I still get the same segfault.

-eric

On Thu, May 21, 2015 at 11:17 AM, MRAB wrote:

...

In "_odict_get_index", for example (there are others), you're caching "ma_keys":

PyDictKeysObject *keys = ((PyDictObject *)od)->ma_keys;

If it resizes, you go back to the label "start", which is after that line, but could "ma_keys" change when it's resized?

MRAB

5:06 p.m.

On 2015-05-21 22:52, Eric Snow wrote:

...

Good catch. Unfortunately, sticking "keys = ((PyDictObject *)od)->ma_keys;" right after "hash = ..." did not make a difference. I still get the same segfault.

So, does it change sometimes?

Eric Snow

5:17 p.m.

On Thu, May 21, 2015 at 4:06 PM, MRAB wrote:

...

On 2015-05-21 22:52, Eric Snow wrote:

...
Good catch. Unfortunately, sticking "keys = ((PyDictObject *)od)->ma_keys;" right after "hash = ..." did not make a difference. I still get the same segfault.

So, does it change sometimes?

The segfault is consistent if I use the same seed (e.g. 7):

PYTHONHASHSEED=7 ./python -m test.regrtest -m test_basic test_configparser

Some seeds always segfault and some seeds never segfault.

-eric

MRAB

5:41 p.m.

On 2015-05-21 23:17, Eric Snow wrote:

...

On Thu, May 21, 2015 at 4:06 PM, MRAB wrote:

...
On 2015-05-21 22:52, Eric Snow wrote:

...
Good catch. Unfortunately, sticking "keys = ((PyDictObject *)od)->ma_keys;" right after "hash = ..." did not make a difference. I still get the same segfault.

So, does it change sometimes?

The segfault is consistent if I use the same seed (e.g. 7):

PYTHONHASHSEED=7 ./python -m test.regrtest -m test_basic test_configparser

Some seeds always segfault and some seeds never segfault.

OK, another thought.

In "_odict_get_index" again, you say that if the hash has changed, the dict might've been resized, but could the dict be resized _without_ the hash changing?

Could the value of "keys" still become invalid even if the hash is the same?

Eric Snow

6:22 p.m.

On Thu, May 21, 2015 at 4:41 PM, MRAB wrote:

...

On 2015-05-21 23:17, Eric Snow wrote:

...
The segfault is consistent if I use the same seed (e.g. 7):

PYTHONHASHSEED=7 ./python -m test.regrtest -m test_basic test_configparser

Some seeds always segfault and some seeds never segfault.

OK, another thought.

In "_odict_get_index" again, you say that if the hash has changed, the dict might've been resized, but could the dict be resized _without_ the hash changing?

Could the value of "keys" still become invalid even if the hash is the same?

Good question. The only way I can see here that the dict would resize is during re-entrance to the interpreter eval loop via Python code potentially triggered through the PyObject_Hash call.

Also, there's no check for a changed hash. The code compares the size of ma_keys (effectively the dict keys hash table) against the size of the odict "fast nodes" table.

-eric
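The re-entrance described above can be seen from the Python side: hashing a key may run arbitrary Python code, and that code may grow the very mapping the key is being stored in. A minimal sketch follows; the MutatingKey class is hypothetical and is not taken from the test suite.

```python
from collections import OrderedDict


class MutatingKey:
    """Key whose __hash__ inserts into a target mapping as a side effect."""

    def __init__(self, target):
        self.target = target
        self.calls = 0

    def __hash__(self):
        self.calls += 1
        # Arbitrary Python code runs here; adding 100 entries forces resizes.
        for i in range(100):
            self.target[i] = i
        return 12345


od = OrderedDict()
key = MutatingKey(od)
od[key] = "value"  # hashing the key grows od before the key itself is stored
print(len(od))
```

Any C code that reads table state before the hash call and uses it afterwards has to allow for this kind of mutation in between.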

MRAB

6:55 p.m.

On 2015-05-22 00:22, Eric Snow wrote:

...

On Thu, May 21, 2015 at 4:41 PM, MRAB wrote:

...
On 2015-05-21 23:17, Eric Snow wrote:

...
The segfault is consistent if I use the same seed (e.g. 7):

PYTHONHASHSEED=7 ./python -m test.regrtest -m test_basic test_configparser

Some seeds always segfault and some seeds never segfault.

OK, another thought.

In "_odict_get_index" again, you say that if the hash has changed, the dict might've been resized, but could the dict be resized _without_ the hash changing?

Could the value of "keys" still become invalid even if the hash is the same?

Good question. The only way I can see here that the dict would resize is during re-entrance to the interpreter eval loop via Python code potentially triggered through the PyObject_Hash call.

Also, there's no check for a changed hash. The code compares the size of ma_keys (effectively the dict keys hash table) against the size of the odict "fast nodes" table.

Ah, OK.

I'm now looking at the use of "PyTuple_Pack". As I understand it, "PyTuple_Pack" borrows the references of the objects passed, and when the tuple itself is DECREFed, those objects will be DECREFed.

"odict_reduce" calls "PyTuple_Pack", passing 1 or 2 references to Py_None which aren't INCREFed first, so could there be a bug there? (There might be similar issues in other functions.)

Eric Snow

7:12 p.m.

On Thu, May 21, 2015 at 5:55 PM, MRAB wrote:

...

I'm now looking at the use of "PyTuple_Pack". As I understand it, "PyTuple_Pack" borrows the references of the objects passed, and when the tuple itself is DECREFed, those objects will be DECREFed.

...

From the docs [1] it seems that PyTuple_Pack does not steal any references and it returns a new reference. Perhaps you were thinking of PyTuple_SetItem (and PyTuple_SET_ITEM)?

[1] https://docs.python.org/3.5//c-api/tuple.html
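The ownership rule at issue, that a tuple holds its own references to its items and releases them when it is destroyed, can be observed from Python with sys.getrefcount. This is a rough sketch under assumptions: it builds the tuple with ordinary Python syntax rather than by calling PyTuple_Pack, the function name is hypothetical, and refcount details are specific to CPython.

```python
import sys


def refcount_delta_for_tuple():
    """Return (growth while a 2-tuple holds obj twice, growth after the tuple dies)."""
    obj = object()  # fresh object, so nothing else refers to it
    before = sys.getrefcount(obj)
    pair = (obj, obj)  # the tuple takes two references of its own
    during = sys.getrefcount(obj)
    del pair  # destroying the tuple releases those two references
    after = sys.getrefcount(obj)
    return during - before, after - before
```

Because the tuple acquires its own references, the caller's references are left untouched, which is consistent with the reading of the docs given above.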

...

"odict_reduce" calls "PyTuple_Pack", passing 1 or 2 references to Py_None which aren't INCREFed first, so could there be a bug there? (There might be similar issues in other functions.)

Alas, I don't think it is. :(

I'll point out that the configparser test in question does a lot of resizes. It may be that the problem only surfaces after many resizes and apparently only for certain hash randomization seeds. At the moment I'm looking at how hash randomization impacts resizing. I'm certainly seeing that the resizes happen at different item counts depending on the seed.

-eric
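One way to see why the seed matters at all is to check how the hash of a single string key, and therefore its initial slot in a power-of-two table, varies with PYTHONHASHSEED. The sketch below uses hypothetical helper names and a hypothetical key, and assumes the initial slot is the hash masked by table size minus one. It does not attempt to reproduce the resize timing mentioned above.

```python
import os
import subprocess
import sys


def str_hash_under_seed(text, seed):
    """Return hash(text) as computed by a fresh interpreter started with the seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.run(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(out.stdout)


def initial_slot(text, seed, table_size=8):
    """Initial probe slot for text in a power-of-two table of the given size."""
    return str_hash_under_seed(text, seed) & (table_size - 1)
```

For example, comparing initial_slot("section", 7) with initial_slot("section", 8) shows whether the same key lands in a different slot under two seeds; the layout of a table, and so the code paths exercised, can differ from seed to seed while staying fixed for any one seed.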

MRAB

7:22 p.m.

On 2015-05-22 01:12, Eric Snow wrote:

...

On Thu, May 21, 2015 at 5:55 PM, MRAB wrote:

...
I'm now looking at the use of "PyTuple_Pack". As I understand it, "PyTuple_Pack" borrows the references of the objects passed, and when the tuple itself is DECREFed, those objects will be DECREFed.

...
From the docs [1] it seems that PyTuple_Pack does not steal any references and it returns a new reference. Perhaps you were thinking of PyTuple_SetItem (and PyTuple_SET_ITEM)?

[1] https://docs.python.org/3.5//c-api/tuple.html

...
"odict_reduce" calls "PyTuple_Pack", passing 1 or 2 references to Py_None which aren't INCREFed first, so could there be a bug there? (There might be similar issues in other functions.)

Alas, I don't think it is. :(

I'd come to the same conclusion.

Oh, well, I'll keep looking...

...

I'll point out that the configparser test in question does a lot of resizes. It may be that the problem only surfaces after many resizes and apparently only for certain hash randomization seeds. At the moment I'm looking at how hash randomization impacts resizing. I'm certainly seeing that the resizes happen at different item counts depending on the seed.

Eric Snow

7:30 p.m.

On Thu, May 21, 2015 at 6:22 PM, MRAB wrote:

...

Oh, well, I'll keep looking...

Thanks! -eric

Eric Snow

9:42 p.m.

On Thu, May 21, 2015 at 6:22 PM, MRAB wrote:

...

Oh, well, I'll keep looking...

I've posted some data to http://bugs.python.org/issue16991 that I hope will shed some light on the issue. We can continue the conversation there. -eric
