[Python-checkins] r65700 - in python/trunk: Lib/email/message.py Misc/NEWS
Jack Diederich
jackdied at jackdied.com
Sat Aug 16 05:02:09 CEST 2008
On Fri, Aug 15, 2008 at 11:03:21PM +0200, antoine.pitrou wrote:
This patch is still suboptimal for bad cases (though better than the old behavior).
See my patch and comments on the tracker for why: http://bugs.python.org/issue2676
-Jack
> Log:
> #2676: email/message.py [Message.get_content_type]: Trivial regex hangs on pathological input
>
>
>
> Modified:
> python/trunk/Lib/email/message.py
> python/trunk/Misc/NEWS
>
> Modified: python/trunk/Lib/email/message.py
> ==============================================================================
> --- python/trunk/Lib/email/message.py (original)
> +++ python/trunk/Lib/email/message.py Fri Aug 15 23:03:21 2008
> @@ -19,18 +19,22 @@
>
> SEMISPACE = '; '
>
> -# Regular expression used to split header parameters. BAW: this may be too
> -# simple. It isn't strictly RFC 2045 (section 5.1) compliant, but it catches
> -# most headers found in the wild. We may eventually need a full fledged
> -# parser eventually.
> -paramre = re.compile(r'\s*;\s*')
> # Regular expression that matches `special' characters in parameters, the
> # existance of which force quoting of the parameter value.
> tspecials = re.compile(r'[ \(\)<>@,;:\\"/\[\]\?=]')
>
>
> -
> # Helper functions
> +def _splitparam(param):
> +    # Split header parameters. BAW: this may be too simple. It isn't
> +    # strictly RFC 2045 (section 5.1) compliant, but it catches most headers
> +    # found in the wild. We may eventually need a full fledged parser
> +    # eventually.
> +    a, sep, b = param.partition(';')
> +    if not sep:
> +        return a.strip(), None
> +    return a.strip(), b.strip()
> +
> def _formatparam(param, value=None, quote=True):
> """Convenience function to format and return a key=value pair.
>
> @@ -436,7 +440,7 @@
>          if value is missing:
>              # This should have no parameters
>              return self.get_default_type()
> -        ctype = paramre.split(value)[0].lower().strip()
> +        ctype = _splitparam(value)[0].lower()
>          # RFC 2045, section 5.2 says if its invalid, use text/plain
>          if ctype.count('/') != 1:
>              return 'text/plain'
>
> Modified: python/trunk/Misc/NEWS
> ==============================================================================
> --- python/trunk/Misc/NEWS (original)
> +++ python/trunk/Misc/NEWS Fri Aug 15 23:03:21 2008
> @@ -48,6 +48,10 @@
> Library
> -------
>
> +- Issue #2676: in the email package, content-type parsing was hanging on
> + pathological input because of quadratic or exponential behaviour of a
> + regular expression.
> +
> - Issue #3476: binary buffered reading through the new "io" library is now
> thread-safe.
>