On Wed, Feb 19, 2020 at 5:23 PM FilippoM <privacy_please@filippo.it> wrote:
Hi, I've got a Pandas data frame that looks like this
In [69]: data.head
Out[69]:
<bound method NDFrame.head of
      OS and Version   Status
0     Android          VIDEO_OK
1     Android 4.2.2    VIDEO_OK
2     Android 9        VIDEO_OK
3     iOS 13.3         VIDEO_OK
4     Windows 10       VIDEO_OK
5     Android 9        VIDEO_OK
...   ...
24    Windows 10       VIDEO_OK
25    Android 9        VIDEO_OK
26    Android 6.0.1    VIDEO_OK
27    Windows XP       VIDEO_OK
28    Android 8.0.0    VIDEO_FAILURE
29    Android 6.0      VIDEO_OK
...   ...
2994  iOS 9.1          VIDEO_OK
2995  Android 9        VIDEO_OK
2996  Windows 10       VIDEO_OK
2997  Android 9        VIDEO_OK
2998  Windows 10       VIDEO_OK
2999  iOS 13.3         VIDEO_OK
with 109 possible values in the OS column and just two possible values (VIDEO_OK and VIDEO_FAILURE) in the Status column.
How can I use Pandas' dataframe magic to calculate, for each of the 109 possible values, how many rows have VIDEO_OK and how many have VIDEO_FAILURE?
I would like to end up with something like
In[]: num_of_oks["iOS 13.3"]
Out: 15
In[]: num_of_not_oks["iOS 13.3"]
Out: 3
I am trying to do some matplotlib scatter plotting
Thanks
Have you considered using traditional unix tools, like cut and count? Or traditional SQL.
This list is for discussion of changes to the Python language itself, in particular the CPython reference implementation. Python-list or a Pandas forum is the appropriate place for this question.

That said, it sounds like you want df.value_counts(). If not, please follow up in a more relevant place.
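For what it's worth, here is a minimal sketch of how that suggestion might look, assuming the two column names shown in the question ("OS and Version" and "Status"). The sample rows are made up for illustration, and the names num_of_oks and num_of_not_oks are borrowed from the question. It groups by OS and uses value_counts() on the Status column; this is one possible approach, not necessarily the only or best one.

```python
import pandas as pd

# Hypothetical sample rows; the real frame has 3000 rows and 109 OS values.
data = pd.DataFrame({
    "OS and Version": ["iOS 13.3", "Android 9", "iOS 13.3",
                       "Windows 10", "Android 9", "iOS 13.3"],
    "Status": ["VIDEO_OK", "VIDEO_OK", "VIDEO_FAILURE",
               "VIDEO_OK", "VIDEO_FAILURE", "VIDEO_OK"],
})

# Count each Status per OS, then pivot the statuses into columns.
# fill_value=0 covers OS values that never show one of the statuses.
counts = (
    data.groupby("OS and Version")["Status"]
    .value_counts()
    .unstack(fill_value=0)
)

# Plain dicts keyed by OS, matching the lookups asked for in the question.
num_of_oks = counts["VIDEO_OK"].to_dict()
num_of_not_oks = counts["VIDEO_FAILURE"].to_dict()

print(num_of_oks["iOS 13.3"])
print(num_of_not_oks["iOS 13.3"])
```

The intermediate counts frame has one row per OS and one column per status, so its two columns could be handed to matplotlib directly for the scatter plot mentioned in the question.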
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/B27PCY...
Code of Conduct: http://python.org/psf/codeofconduct/
participants (2)
- David Mertz
- James Lu