A regex that's vulnerable to pathological behavior is a DoS attack waiting to happen, especially when it is used to parse log data (which might contain untrusted input). If possible, we should make it harder for people to shoot themselves in the foot.

This is exactly what happened to me. I have a job that automatically parses logs as they are uploaded. A log came in with an unexpected pattern that triggered pathological behavior in my regex, behavior that never occurred when processing the expected input. The import pipeline backed up for many hours before I noticed and fixed it.
 
While definitely not as bad and not as likely as SQL injection, I think the possibility of regex DoS is totally missing from the stdlib re docs. Should something be added there saying that if you need to put user input into an expression, best practice is to re.escape it?
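
For reference, a minimal sketch of what that suggestion covers: re.escape neutralizes metacharacters in text that is interpolated into a pattern. The input string and the helper name build_literal_search below are hypothetical, chosen only for illustration.

```python
import re

def build_literal_search(user_text: str) -> "re.Pattern[str]":
    """Compile a pattern that matches user_text literally (hypothetical helper)."""
    return re.compile(re.escape(user_text))

user_text = "a.c"  # hypothetical user-supplied search term

unescaped = re.compile(user_text)          # "." still means "any character"
escaped = build_literal_search(user_text)  # "." now means a literal dot

print(unescaped.search("abc") is not None)  # the wildcard matches "b"
print(escaped.search("abc") is not None)    # no literal "a.c" in "abc"
print(escaped.search("a.c") is not None)    # literal match
```

Note that this only affects text that becomes part of the pattern; it does nothing for the string being searched.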

Unless I am missing something, I don't see how re.escape would have helped me here. I wasn't trying to treat arbitrary input as a regex, so escaping the regex characters in it wouldn't have done anything to help me. The problem is that a regex *that I wrote* had a bug in it that caused pathological behavior, but it wasn't found during testing because it only occurred when matching against an unexpected input.
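
To make the failure mode concrete, here is a minimal sketch. The pattern is NOT the one from my pipeline; it is a hypothetical stand-in with the same kind of bug (a quantifier nested inside another quantifier), and the sample strings are made up. It matches the expected input instantly, but on an input that cannot match, the backtracking engine tries every way of splitting the run of word characters.

```python
import re
import time

# Hypothetical pattern: words separated by single whitespace characters.
# \w+ nested inside (...)+ makes the split between "words" ambiguous.
LINE = re.compile(r"^(\w+\s?)+$")

# One possible unambiguous rewrite (an assumption, not a general fix).
SAFER = re.compile(r"^\w+(\s\w+)*\s?$")

def time_match(pattern, text):
    """Return the seconds spent trying to match pattern against text."""
    start = time.perf_counter()
    pattern.match(text)
    return time.perf_counter() - start

expected = "disk usage nominal"  # the kind of line the tests covered
assert LINE.match(expected) is not None

# Unexpected input: a run of word characters followed by a character the
# pattern can never match, so the engine backtracks through every split.
for n in (12, 14, 16, 18):
    unexpected = "a" * n + "!"
    slow = time_match(LINE, unexpected)
    fast = time_match(SAFER, unexpected)
    print(f"n={n:2d}  nested: {slow:.5f}s  rewritten: {fast:.5f}s")
```

With CPython's backtracking engine I would expect the time for the nested pattern to grow roughly exponentially with n, while the rewritten pattern stays fast. Nothing in the input needs escaping here, and re.escape is never involved, because the input is only ever the string being matched.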

--
J.B. Langston
Tech Support
Tools Wrangler
+1 650 389 6000 | datastax.com


On Mon, Feb 14, 2022 at 3:59 PM Nick Timkovich <prometheus235@gmail.com> wrote:
A regex that's vulnerable to pathological behavior is a DoS attack waiting to happen, especially when it is used to parse log data (which might contain untrusted input). If possible, we should make it harder for people to shoot themselves in the foot.

While definitely not as bad and not as likely as SQL injection, I think the possibility of regex DoS is totally missing from the stdlib re docs. Should something be added there saying that if you need to put user input into an expression, best practice is to re.escape it?


