[Python-ideas] Exploiting type-homogeneity in list.sort() (again!)
Elliot Gorokhovsky
elliot.gorokhovsky at gmail.com
Mon Mar 6 22:41:40 EST 2017
On Mon, Mar 6, 2017, 4:42 PM Jim J. Jewett <jimjjewett at gmail.com> wrote:
(1) Good Job.
Thanks!
(3) Ideally, your graph would have the desired-to-be lines after the
as-is lines; for English writing, that would mean putting your (short
red) lines to the right of the (tall blue) lines.
(4) I suspect colors other than red and blue would be helpful as
well, but I am explicitly not a design wiz. My best suggestion (other
than "ask someone who isn't me") would be to use blue for as-is and
green for to-be.
Yeah, I know the colors are terrible. I have no graphic design
experience, so I figured I'd make the graph the same colors as the diagram.
(5) I don't know that all-ASCII (or at least all-Latin1) is true for
most applications, but it is certainly true for most datasets run in
countries where latin1 is sufficient, which includes most places
outside of Asia, Africa, and perhaps Eastern Europe. If anything,
that strengthens your case, since you can win on plenty of datasets
even for applications where it isn't always safe, and for those
datasets that do require a wider charset, you're likely to discover
this quickly.
True. I think a much bigger part of the consideration is that a lot of
software (e.g. file systems) *doesn't* support Unicode, at least by default,
so if you're dealing with text you got from another program or from the OS,
it's usually ASCII. And usually, our shell scripts are getting their text
from other programs (e.g. file names).
(6) When I saw the flowchart around f(v, w), at first I was thinking
about the key function used in some sorting... I suppose that isn't
relevant, since those sorts (If I Recall Correctly) already create a
parallel array to avoid recomputing the keys, but ... it might be
worth clarifying, if you can find a way to do it easily without adding
too much complexity. Maybe just change "this f" to "the compare
function f"?
You are correct -- key sorts create a parallel array. In fact, *all* sorts
create a parallel array, for safety: the only way to make sure the objects
aren't getting mutated as you sort is to keep them safe! If the objects are
getting mutated during the sort, something is clearly going horribly wrong,
but at least you won't segfault. (This is not part of my patch -- it's part
of the original implementation).
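A small sketch of how that protection looks from the Python side. This
relies on documented CPython behavior (the list appears empty for the
duration of the sort, and a ValueError is raised if a mutation is detected);
other implementations may differ. The helper names sort_and_observe and
sort_with_mutation are made up for illustration.

```python
def sort_and_observe(items):
    """Sort items in place and record len(items) as seen from the key function."""
    seen = []

    def key(x):
        # CPython detaches the list's storage while sorting, so the list
        # looks empty from here (implementation detail, not a language rule).
        seen.append(len(items))
        return x

    items.sort(key=key)
    return seen


def sort_with_mutation(items):
    """Mutate the list from inside the key function and report what happens."""

    def key(x):
        items.append(x)  # mutation during the sort
        return x

    try:
        items.sort(key=key)
    except ValueError as exc:
        # CPython detects the mutation and complains instead of crashing.
        return str(exc)
    return None


data = [3, 1, 2]
lengths = sort_and_observe(data)
print(data, lengths)
print(sort_with_mutation([3, 1, 2]))
```

The point is only that a misbehaving key or compare function gets an
exception rather than a segfault.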
(7) I hope you have put in a pull request to get this added to python 3.7.
Thanks! I don't think I can make pull requests, but I am going to submit a
fixed version to the bug tracker. (Tim pointed out that my current code
isn't thread-safe or adversary-safe because it stores compare_function in a
global.) I have to modify the code to keep it in local scope and pass it in to
every function that needs it. This will make the diff hairier, but not
increase code complexity.
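Roughly, the shape of that change, sketched in Python rather than the C of
the actual patch. Everything here is a stand-in: choose_compare,
same_type_lt, generic_lt, and the toy insertion sort are hypothetical names
for illustration, not the real code. The compare function is picked once
per sort and handed down as an argument instead of living in a global.

```python
def generic_lt(a, b):
    """Stand-in for the general comparison that handles mixed types."""
    return a < b


def same_type_lt(a, b):
    """Stand-in for a specialized comparison used when all types match."""
    return a < b


def choose_compare(items):
    """Pre-check: use the specialized compare only if every type is identical."""
    if items and all(type(x) is type(items[0]) for x in items):
        return same_type_lt
    return generic_lt


def insertion_sort(items, lt):
    """Toy sort that receives its compare function as a parameter."""
    for i in range(1, len(items)):
        current = items[i]
        j = i
        while j > 0 and lt(current, items[j - 1]):
            items[j] = items[j - 1]
            j -= 1
        items[j] = current


def sort_in_place(items):
    # The choice is local to this call, so concurrent or nested sorts
    # cannot overwrite each other's compare function.
    lt = choose_compare(items)
    insertion_sort(items, lt)


names = ["pear", "apple", "fig"]
sort_in_place(names)
print(names)
```

Every helper that compares grows one extra parameter, which is why the diff
gets longer without the logic getting more complicated.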
-jJ