Re: [Python-Dev] cpython (2.6): - Issue #13703: oCERT-2011-003: add -R command-line option and PYTHONHASHSEED
On Tue, 21 Feb 2012 02:44:32 +0100 barry.warsaw <python-checkins@python.org> wrote:
+ This is intended to provide protection against a denial-of-service caused by
+ carefully-chosen inputs that exploit the worst case performance of a dict
+ insertion, O(n^2) complexity. See
+ http://www.ocert.org/advisories/ocert-2011-003.html for details.
The worst case performance of a dict insertion is O(n) (not counting potential resizes, whose cost is amortized by the overallocation heuristic). It's dict construction that has O(n**2) worst case complexity.
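The distinction can be illustrated with a minimal sketch (not part of the commit under review). It assumes a hypothetical key class, `Colliding`, whose instances all share one hash value, and counts how many equality comparisons it takes to build a dict from such keys. Each single insertion has to compare against the keys already present, so it is linear; it is the sum over all insertions that grows quadratically.

```python
class Colliding:
    """Hypothetical key type whose instances all share one hash value."""

    eq_calls = 0  # class-wide counter of __eq__ invocations

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # Every key follows the same probe sequence in the table.
        return 42

    def __eq__(self, other):
        Colliding.eq_calls += 1
        return self.value == other.value


def comparisons_to_build(n):
    """Build a dict of n colliding keys; return the number of __eq__ calls."""
    Colliding.eq_calls = 0
    d = {}
    for i in range(n):
        # Inserting the i-th key must rule out each of the i keys
        # already stored, so this one step costs O(i) comparisons.
        d[Colliding(i)] = None
    return Colliding.eq_calls
```

Doubling `n` should roughly quadruple the value returned by `comparisons_to_build(n)`, which is the O(n**2) construction cost, while any one insertion stays bounded by the number of keys already in the dict.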
@@ -1232,9 +1233,9 @@
     flags__doc__,       /* doc */
     flags_fields,       /* fields */
 #ifdef RISCOS
+    17
+#else
     16
-#else
-    15
 #endif
Changing the sequence size of sys.flags can break existing code (e.g. tuple-unpacking).

Regards

Antoine.
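A small sketch of the breakage in question, using hypothetical stand-ins rather than the real `sys.flags` object: `OldFlags` and `NewFlags` are namedtuples invented for illustration, with far fewer fields than the real structure. Positional unpacking depends on the sequence length, while attribute access does not.

```python
from collections import namedtuple

# Hypothetical stand-ins for sys.flags before and after a field is appended.
OldFlags = namedtuple("OldFlags", ["debug", "optimize"])
NewFlags = namedtuple("NewFlags", ["debug", "optimize", "hash_randomization"])


def unpack_two(flags):
    """Positional unpacking: works only while the sequence has two items."""
    debug, optimize = flags
    return debug, optimize


def read_by_name(flags):
    """Attribute access: unaffected by fields appended later."""
    return flags.debug, flags.optimize
```

With these definitions, `unpack_two(NewFlags(0, 0, 1))` raises `ValueError`, whereas `read_by_name` accepts both shapes. Code that reads flags by name is therefore not affected by a change in sequence size.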
2012/2/20 Antoine Pitrou <solipsis@pitrou.net>:
Changing the sequence size of sys.flags can break existing code (e.g. tuple-unpacking).
I told George I didn't think it was a major problem. How much code have you seen trying to unpack sys.flags? (Moreover, such code would have been broken by previous minor releases.)

--
Regards,
Benjamin
On 21/02/2012 03:04, Benjamin Peterson wrote:
2012/2/20 Antoine Pitrou <solipsis@pitrou.net>:
Changing the sequence size of sys.flags can break existing code (e.g. tuple-unpacking).
I told George I didn't think it was a major problem. How much code have you seen trying to unpack sys.flags? (Moreover, such code would have been broken by previous minor releases.)
If by “minor” you mean the Y in Python X.Y.Z, then I think the precedent does not apply here: people expect to have to check their code when going from X.Y to X.Y+1, but not when they update X.Y.Z to X.Y.Z+1. But I agree this is rather theoretical, as I don’t see why anyone would iterate over sys.flags. The important point IMO is having clear policies for us and our users and sticking with them; here the decision was that adding a new flag in a bugfix release was needed, so it’s fine. Regards
Two more small details to address, and then I think we're ready to start creating release candidates.

- sys.flags.hash_randomization

In the tracker issue, I had previously stated a preference that this flag only reflect the state of the -R command line option, not the $PYTHONHASHSEED environment variable. Well, that's not the way other options/envars such as -O/$PYTHONOPTIMIZE work. sys.flags.optimize gets set if either of those two things set it, so sys.flags.hash_randomization needs to follow that convention. Thus no change is necessary here.

- sys.hash_seed

In the same tracker issue, I expressed my opinion that the hash seed should be exposed in sys.hash_seed for reproducibility. There's a complication that Victor first mentioned in IRC, but I didn't quite understand the implications of at first. When PYTHONHASHSEED=random is set, there *is no* hash seed. We pull random data straight out of urandom and use that directly as the secret, so there's nothing to expose in sys.hash_seed.

In that case, sys.hash_seed is pretty much redundant, since Python code could just check getenv('PYTHONHASHSEED') and be done with it. I don't think there's anything useful to expose to Python or communicate between Python executables when truly random hash data is used.

Thus, unless there are objections, I consider the current state of the Python 2.6 branch to be finished wrt issue 13703.

Cheers,
-Barry
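For readers who want to try the reproducibility point, here is a hedged sketch. It assumes a current CPython 3 interpreter (where hash randomization is on by default) rather than the 2.6 branch discussed in this thread, and the helper names `run_child` and `seed_setting` are invented for illustration. It starts child interpreters with a chosen PYTHONHASHSEED and reads the setting back from the environment, as the message suggests Python code could do.

```python
import os
import subprocess
import sys

# Code run in the child: report the flag and the hash of a fixed string.
CHILD = "import sys; print(sys.flags.hash_randomization, hash('python-dev'))"


def run_child(seed):
    """Run a fresh interpreter with PYTHONHASHSEED=seed; return (flag, hash)."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output([sys.executable, "-c", CHILD], env=env)
    flag, value = out.decode().split()
    return int(flag), int(value)


def seed_setting(environ=None):
    """Return the raw PYTHONHASHSEED setting, or None when it is unset."""
    environ = os.environ if environ is None else environ
    return environ.get("PYTHONHASHSEED")
```

Two calls to `run_child(123)` should report the same string hash, because a fixed integer seed makes hashing reproducible across processes. With `run_child("random")` the secret comes from the operating system's random source, so there is no seed value to report and `seed_setting` can only return the literal string `"random"`.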
On Wed, Feb 22, 2012 at 10:14 AM, Barry Warsaw <barry@python.org> wrote:
Two more small details to address, and then I think we're ready to start creating release candidates.
- sys.flags.hash_randomization
In the tracker issue, I had previously stated a preference that this flag only reflect the state of the -R command line option, not the $PYTHONHASHSEED environment variable. Well, that's not the way other options/envars such as -O/$PYTHONOPTIMIZE work. sys.flags.optimize gets set if either of those two things set it, so sys.flags.hash_randomization needs to follow that convention. Thus no change is necessary here.
- sys.hash_seed
In the same tracker issue, I expressed my opinion that the hash seed should be exposed in sys.hash_seed for reproducibility. There's a complication that Victor first mentioned in IRC, but I didn't quite understand the implications of at first. When PYTHONHASHSEED=random is set, there *is no* hash seed. We pull random data straight out of urandom and use that directly as the secret, so there's nothing to expose in sys.hash_seed.
In that case, sys.hash_seed is pretty much redundant, since Python code could just check getenv('PYTHONHASHSEED') and be done with it. I don't think there's anything useful to expose to Python or communicate between Python executables when truly random hash data is used.
Thus, unless there are objections, I consider the current state of the Python 2.6 branch to be finished wrt issue 13703.
+10
participants (5)
- Antoine Pitrou
- Barry Warsaw
- Benjamin Peterson
- Gregory P. Smith
- Éric Araujo