<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">It’s perhaps not surprising that this hasn’t been optimized, because most folks don’t have that many categories.  If you have an actual use-case for that many categories, submitting a bug report on GitHub would be great.  <div class=""><br class=""></div><div class=""><div class="">Cheers,   Jody<br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Oct 25, 2018, at  16:47 PM, Douglas Clowes <<a href="mailto:douglas.clowes@gmail.com" class="">douglas.clowes@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class="">> Strings are now treated as “categories” rather than cast to floats,  and plotted in the order received.<div class=""><br class=""></div><div class="">><a href="https://matplotlib.org/gallery/lines_bars_and_markers/categorical_variables.html" target="_blank" class=""> https://matplotlib.org/gallery/lines_bars_and_markers/categorical_variables.html</a><br class=""><div class=""><br class=""></div><div class="">> Cheers,   Jody</div><div class=""><br class=""></div><div class="">Thanks for that Jody, I did just "get lucky".</div><div class=""><br class=""></div><div class="">A quick assessment shows that the high CPU usage associated with this operation is at least partially avoidable.</div><div class=""><br class=""></div><div class="">The majority of the CPU time, according to:</div><div class="">  python3 -m cProfile -s time plotit.py -s|head -n20</div><div class="">is in or under StrCategoryFormatter._text, which seems to be called far more often than I would expect: on the order of the number of categories squared in my samples, with 40K calls for 100 categories and 4M for 1000 on mpl 2.2, and 6M on mpl 3.0. 
Seems high.<br class=""></div><div class=""><br class=""></div><div class="">Within the _text function in 2.2, the most expensive operation is the constant test of the numpy version. This can be significantly reduced by moving the constant expression with a simple change like:</div><div class=""><br class=""></div><div class="">diff --git a/lib/matplotlib/category.py b/lib/matplotlib/category.py<br class="">index b135bff1c..89b1c5bd9 100644<br class="">--- a/lib/matplotlib/category.py<br class="">+++ b/lib/matplotlib/category.py<br class="">@@ -28,6 +28,8 @@ import matplotlib.ticker as ticker<br class=""> # np 1.6/1.7 support<br class=""> from distutils.version import LooseVersion<br class=""> <br class="">+NP_PRE_1_7_0 = LooseVersion(np.__version__) < LooseVersion('1.7.0')<br class="">+<br class=""> VALID_TYPES = tuple(set(six.string_types +<br class="">                         (bytes, six.text_type, np.str_, np.bytes_)))<br class=""> <br class="">@@ -158,7 +160,7 @@ class StrCategoryFormatter(ticker.Formatter):<br class="">     def _text(value):<br class="">         """Converts text values into `utf-8` or `ascii` strings<br class="">         """<br class="">-        if LooseVersion(np.__version__) < LooseVersion('1.7.0'):<br class="">+        if NP_PRE_1_7_0:<br class="">             if (isinstance(value, (six.text_type, np.unicode))):<br class="">                 value = value.encode('utf-8', 'ignore').decode('utf-8')<br class="">         if isinstance(value, (np.bytes_, six.binary_type)):<br class=""><br class=""></div><div class=""><br class=""></div></div></div></div></div></div></div>
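<div class="">As a rough, self-contained sketch of the same idea as the diff above (this is not matplotlib code; the names version_tuple, text_per_call, text_hoisted and count_parses are made up for illustration, and the interpreter version stands in for the numpy version), the snippet below counts how often the version strings get parsed when the comparison sits inside the function versus at module level:</div>
<pre class="">
```python
import sys

# Hypothetical stand-in for the numpy version test in _text. The real code
# compares LooseVersion objects; here a tiny parser with a call counter is
# used so the cost of re-evaluating the constant test can be observed.
parse_calls = 0


def version_tuple(text):
    """Parse a dotted version string such as '3.0.1' into a tuple of ints."""
    global parse_calls
    parse_calls += 1
    return tuple(int(part) for part in text.split("."))


# Assumption for illustration only: the running Python version plays the
# role of np.__version__, and THRESHOLD plays the role of '1.7.0'.
RUNNING = "%d.%d.%d" % sys.version_info[:3]
THRESHOLD = "3.0.0"


def text_per_call(value):
    """Re-evaluates the constant comparison on every call."""
    if version_tuple(THRESHOLD) > version_tuple(RUNNING):
        value = value.encode("utf-8", "ignore").decode("utf-8")
    return value


# Evaluated once at import time, like NP_PRE_1_7_0 in the diff above.
IS_OLD = version_tuple(THRESHOLD) > version_tuple(RUNNING)


def text_hoisted(value):
    """Only reads the precomputed module-level flag."""
    if IS_OLD:
        value = value.encode("utf-8", "ignore").decode("utf-8")
    return value


def count_parses(func, n_calls):
    """Call func n_calls times and return how many version parses occurred."""
    global parse_calls
    parse_calls = 0
    for i in range(n_calls):
        func("category %d" % i)
    return parse_calls
```
</pre>
<div class="">Calling count_parses(text_per_call, n) and count_parses(text_hoisted, n) for the same n shows two parses per call in the first case and none in the second. This only removes the per-call cost; it does nothing about the number of calls itself.</div>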

</div></blockquote></div><br class=""></div></div></body></html>