-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Antoine Pitrou wrote:
On Wed, 7 Jul 2010 19:44:31 +0200 Eli Bendersky <eliben@gmail.com> wrote:
For what it's worth, my benchmarking showed that modifying the heuristic to only kick in when there are more than 100 kinds of elements (Terry's option A) didn't affect the runtime of matching whatsoever, even when the heuristic *does* kick in. All it adds, really, is the overhead of a single 'if' statement. So it wouldn't be right to assume that somehow modifying the heuristic or allowing to turn it off will negatively affect performance in the special case Tim originally optimized for.
Just because it doesn't affect performance in your tests doesn't mean it won't do so in the general case. Consider a case where Tim's junk optimization kicked in and helped improve performance a lot, but where there are still less than 100 alphabet symbols. The new heuristic will ruin this use case.
That would describe pretty much every C program ever written, for instance, and nearly as high a percentage of all Python modules / scripts ever written: the 'string.printable' set, less formfeed and vertical tab, is 98 characters long.
That's why I'm advocating a dedicated flag instead.
+1. Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver@palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAkw032wACgkQ+gerLs4ltQ5xigCfVLhTzFX733cZAO2Jv6JZQm0i HoIAmQEnTyxa2oLAuE22M7FZHUS00xDu =WYt2 -----END PGP SIGNATURE-----