[Matplotlib-users] Plotting Lists of Strings has high CPU

Douglas Clowes douglas.clowes at gmail.com
Thu Oct 25 19:47:19 EDT 2018

> Strings are now treated as “categories” rather than cast to floats, and
> plotted in the order received.


> Cheers, Jody

Thanks for that, Jody. I did just "get lucky".

Some profiling shows that the high CPU usage associated with this operation
is at least partially avoidable.

The majority of the CPU time, according to:
  python3 -m cProfile -s time plotit.py -s|head -n20
is spent in or under StrCategoryFormatter._text, which seems to be called
far more often than I would expect: on the order of the number of
categories squared in my samples, with 40K calls for 100 categories and 4M
for 1000 on mpl 2.2, and 6M on mpl 3.0. Seems high.
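
As a rough sketch of how such call counts can be checked from within Python
rather than by reading the cProfile listing: the helper below runs a callable
under cProfile and sums the calls to any function with a given name. This is
not the script used above. The names count_calls, format_all and the
stand-in _text are hypothetical and are not matplotlib code, and the helper
relies on the `stats` dictionary that pstats.Stats exposes.

```python
import cProfile
import pstats


def count_calls(func, target_name):
    """Run func() under cProfile; return total calls to functions named target_name."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    stats = pstats.Stats(profiler)
    total = 0
    # stats.stats maps (filename, lineno, funcname) -> (cc, nc, tt, ct, callers)
    for (_file, _line, name), (_cc, ncalls, _tt, _ct, _callers) in stats.stats.items():
        if name == target_name:
            total += ncalls
    return total


def _text(value):
    # Hypothetical stand-in for the formatter helper; NOT the matplotlib function.
    return str(value)


def format_all(n):
    # Deliberately quadratic: every category is formatted once per category.
    for _ in range(n):
        for i in range(n):
            _text(i)


if __name__ == "__main__":
    for n in (10, 100):
        print(n, count_calls(lambda: format_all(n), "_text"))
```

Replacing format_all with the real plotting code and keeping "_text" as the
target name should show how the call count grows with the number of
categories.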

Within the _text function in 2.2, the most expensive operation is the
repeated test of the numpy version, whose result never changes. This cost
can be significantly reduced by hoisting the constant expression to module
level with a simple change like:

diff --git a/lib/matplotlib/category.py b/lib/matplotlib/category.py
index b135bff1c..89b1c5bd9 100644
--- a/lib/matplotlib/category.py
+++ b/lib/matplotlib/category.py
@@ -28,6 +28,8 @@ import matplotlib.ticker as ticker
 # np 1.6/1.7 support
 from distutils.version import LooseVersion

+NP_PRE_1_7_0 = LooseVersion(np.__version__) < LooseVersion('1.7.0')
 VALID_TYPES = tuple(set(six.string_types +
                         (bytes, six.text_type, np.str_, np.bytes_)))

@@ -158,7 +160,7 @@ class StrCategoryFormatter(ticker.Formatter):
     def _text(value):
         """Converts text values into `utf-8` or `ascii` strings
-        if LooseVersion(np.__version__) < LooseVersion('1.7.0'):
+        if NP_PRE_1_7_0:
             if (isinstance(value, (six.text_type, np.unicode))):
                 value = value.encode('utf-8', 'ignore').decode('utf-8')
         if isinstance(value, (np.bytes_, six.binary_type)):
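
The effect of the change can be illustrated without matplotlib. The sketch
below is only an illustration under assumptions: parse_version is a
hypothetical stand-in for LooseVersion, LIB_VERSION stands in for
np.__version__, and text_slow / text_fast are simplified stand-ins for the
before and after versions of _text. It counts how often the version strings
are parsed rather than measuring time.

```python
def parse_version(text):
    """Hypothetical stand-in for LooseVersion: '1.15.4' -> (1, 15, 4)."""
    parse_version.calls += 1
    return tuple(int(part) for part in text.split(".") if part.isdigit())


parse_version.calls = 0

LIB_VERSION = "1.15.4"  # assumption: stands in for np.__version__


def text_slow(value):
    # Before: the constant comparison is re-evaluated on every call.
    if parse_version(LIB_VERSION) < parse_version("1.7.0"):
        value = value.encode("utf-8", "ignore").decode("utf-8")
    return value


# After: evaluate the constant comparison once, at import time.
PRE_1_7_0 = parse_version(LIB_VERSION) < parse_version("1.7.0")


def text_fast(value):
    if PRE_1_7_0:
        value = value.encode("utf-8", "ignore").decode("utf-8")
    return value


if __name__ == "__main__":
    parse_version.calls = 0
    for i in range(1000):
        text_slow(str(i))
    print("slow path parses:", parse_version.calls)

    parse_version.calls = 0
    for i in range(1000):
        text_fast(str(i))
    print("fast path parses:", parse_version.calls)
```

Both functions return the same values; the only difference is that the
hoisted form does no version parsing per call.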