<br><br><div class="gmail_quote">On Fri, Jun 22, 2012 at 9:42 AM, eat <span dir="ltr"><<a href="mailto:e.antero.tammi@gmail.com" target="_blank">e.antero.tammi@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


Hi,<br><br><div class="gmail_quote"><div class="im">On Fri, Jun 22, 2012 at 7:51 AM, Gael Varoquaux <span dir="ltr"><<a href="mailto:gael.varoquaux@normalesup.org" target="_blank">gael.varoquaux@normalesup.org</a>></span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>On Thu, Jun 21, 2012 at 08:59:09PM -0400, Benjamin Root wrote:<br>

>      > munkres seems to be a pure python implementation ;-).<br>

<br>

>      Oops! I could have sworn that I once tried one named munkres that used<br>

>      numpy. But that was several years ago.<br>

<br>

>    There is a development branch of sk-learn with an implementation of the<br>

>    hungarian assignment solver using numpy. It will even do non-square<br>

>    matrices and matrices with an empty dimension.<br>

<br>

</div>Yes, absolutely, thanks to Ben:<br>

<a href="https://github.com/GaelVaroquaux/scikit-learn/blob/hungarian/sklearn/utils/hungarian.py" target="_blank">https://github.com/GaelVaroquaux/scikit-learn/blob/hungarian/sklearn/utils/hungarian.py</a><br>

I never merged this in the main scikit-learn tree, because munkres is not<br>

used so far. Maybe I should merge it in the main tree, or maybe it should<br>

be added to scipy or numpy.<br></blockquote></div><div>I made some simple timing comparisons (see attached picture) between numpy based hungarian and pure python shortest path based hungarian_sp. It seems that pure python based implementation outperforms numpy based implementation. Timings are averaged over five runs.</div>


<div><br></div><div>The difference cannot totally be explained by different algorithms (although shortest path based seem to scale better).  Rather the heavy access to rows and columns seem to favor list of lists. So this type of algorithms may indeed be real challenges for numpy.</div>


<div><br></div></div></blockquote><div><br>eat,<br><br>Thanks for that analysis.  Personally, I never needed high-performance so I never bothered to optimize it.  However, it does appear that there is an order-of-magnitude difference between the two, and so it might be worth it to see what can be done to fix that.<br>


<br>Ben Root<br><br></div></div>