[Numpy-discussion] String accessor methods

Todd toddrjen at gmail.com
Sat Mar 6 09:44:17 EST 2021


Currently, working with strings in numpy is not very convenient. You have
to use a separate set of functions in a separate namespace, and those
functions are relatively limited and poorly documented.
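
For reference, here is a minimal sketch of what this looks like today,
assuming the "separate namespace" above refers to `np.char`. The variable
names are only illustrative:

```python
import numpy as np

mystr = np.array(["test first", "test second", "test third"])

# The array has no string methods of its own, so it has to be passed
# to a free function in the np.char namespace.
titled = np.char.title(mystr)
print(titled.tolist())
```

The call works, but nothing on the array itself points you to it.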

A solution several other projects, including pandas [0] and xarray [1],
have found is string accessor methods. These are a set of methods attached
to a `str` attribute of the class. They have the advantage that they are
always available and have a well-defined object they operate on. On
non-str dtypes, accessing the attribute would raise an exception.

This would also provide a standardized set of methods and behaviors, as
part of the numpy API, that other classes could depend on.

An example would be something like this:

>>> mystr = np.array(["test first", "test second", "test third"])
>>> mystr.str.title()
array(['Test First', 'Test Second', 'Test Third'], dtype='<U11')
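
To make the idea concrete, here is a rough sketch of how such an accessor
could be prototyped today on top of `np.char`. It is only an illustration
under stated assumptions, not a proposed implementation: `StringAccessor`
and `StrArray` are hypothetical names, only three methods are wrapped, and
raising `TypeError` on non-str dtypes is my own choice.

```python
import numpy as np


class StringAccessor:
    """Hypothetical accessor; nothing like this exists in numpy itself."""

    def __init__(self, arr):
        # Only unicode ("U") and bytes ("S") dtypes are accepted here.
        if arr.dtype.kind not in ("U", "S"):
            raise TypeError(
                f"the .str accessor needs a string dtype, got {arr.dtype}"
            )
        self._arr = arr

    # Each method simply forwards to the existing np.char function.
    def title(self):
        return np.char.title(self._arr)

    def upper(self):
        return np.char.upper(self._arr)

    def lower(self):
        return np.char.lower(self._arr)


class StrArray(np.ndarray):
    """Hypothetical ndarray subclass that exposes a `str` attribute."""

    @property
    def str(self):
        # Hand the accessor a plain ndarray view of the same data.
        return StringAccessor(self.view(np.ndarray))


mystr = np.array(["test first", "test second", "test third"]).view(StrArray)
print(mystr.str.title().tolist())

# On a non-str dtype, touching the attribute raises.
try:
    np.arange(3).view(StrArray).str
except TypeError as exc:
    print("error:", exc)
```

A subclass is used here only because attributes cannot be added to
`np.ndarray` from Python; the proposal itself is for the attribute to live
on the array class directly.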

[0]
https://pandas.pydata.org/pandas-docs/stable/user_guide/text.html#string-methods
[1]
https://xarray.pydata.org/en/stable/generated/xarray.core.accessor_str.StringAccessor.html