[New-bugs-announce] [issue44901] Info about used pickle protocol used by multiprocessing.Queue

Christian Buhtz report at bugs.python.org
Thu Aug 12 09:33:02 EDT 2021


New submission from Christian Buhtz <c.buhtz at posteo.jp>:

I read some of the PEPs about pickling, but I would not say that I understood everything.

Of course I checked the documentation for multiprocessing.Queue. Currently it is not clear to me which pickle protocol multiprocessing.Queue uses.
Maybe I missed something in the documentation, or the documentation can be improved?

 - Is there a fixed default, maybe one that differs between Python versions?
 - Or is the pickle protocol version selected dynamically depending on the kind/type/size of data put() into the Queue?

Is there a way to find out at runtime which protocol version is used for a specific Queue instance with a specific piece of data?
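
One way to check this at runtime is sketched below. It rests on an assumption about CPython's implementation that is not stated in this report: that Queue.put() serializes objects through multiprocessing.reduction.ForkingPickler.dumps() without passing a protocol, so the pickle module's default applies. The helper name protocol_of and the sample data are hypothetical. The sketch reads the protocol number from the first two bytes of the serialized payload, which works because pickle streams of protocol 2 and higher begin with a PROTO opcode.

```python
import pickle
from multiprocessing.reduction import ForkingPickler


def protocol_of(payload):
    """Return the protocol number announced in a pickle stream, or None.

    Streams written with protocol 2 or higher start with the PROTO opcode
    (byte 0x80) followed by one byte holding the protocol number.
    Protocols 0 and 1 carry no such marker, so None is returned for them.
    """
    data = bytes(payload)
    if len(data) >= 2 and data[0] == 0x80:
        return data[1]
    return None


# Assumption: this mirrors what Queue.put() does internally, i.e. it calls
# ForkingPickler.dumps(obj) and leaves the protocol at its default.
sample = {"column": list(range(10))}  # hypothetical stand-in for real data
payload = ForkingPickler.dumps(sample)

print("pickle.DEFAULT_PROTOCOL:", pickle.DEFAULT_PROTOCOL)
print("pickle.HIGHEST_PROTOCOL:", pickle.HIGHEST_PROTOCOL)
print("protocol seen in payload:", protocol_of(payload))
```

If the assumption holds, the protocol seen in the payload should match pickle.DEFAULT_PROTOCOL of the running interpreter, regardless of the kind or size of the data.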

Background:
I use Python 3.7 and 3.9 with Pandas 1.3.5.
I parallelize work on huge(?) pandas.DataFrame objects. I simply cut them into pieces (along the row axis); the number of pieces is limited to the machine's CPU core count (minus 1). The cutting happens several times in my scripts because for some things I need the data as one complete DataFrame.
Just as an example, here is one such piece, which is passed to a worker as an argument and sent back via a Queue (with 7 workers!):

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 226687 entries, 0 to 226686
Data columns (total 38 columns):
 #   Column              Non-Null Count   Dtype
---  ------              --------------   -----
 0   HASH_ ....
 ....
 37  NAME_ORG            226687 non-null  object
dtypes: datetime64[ns](6), float64(1), int64(1), object(30)
memory usage: 65.7+ MB 

I am a bit "scared" that Python is wasting my CPU time by doing some compression on that data. ;) I just want to get a better idea of what is done in the background.
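
The compression concern can be probed with a small experiment. The sketch below pickles a deliberately compressible byte string (a hypothetical stand-in, not the DataFrame above) and compares the result with zlib output. The pickle module itself does not compress data, so the pickled payload should be about as large as the raw input, while zlib shrinks it drastically.

```python
import pickle
import zlib

# Hypothetical, highly compressible stand-in for a large payload.
raw = b"A" * 1_000_000

# Serialize with the interpreter's default protocol.
pickled = pickle.dumps(raw, protocol=pickle.DEFAULT_PROTOCOL)

# Explicit compression, shown only for comparison.
compressed = zlib.compress(raw)

print("raw bytes:       ", len(raw))
print("pickled bytes:   ", len(pickled))
print("zlib-compressed: ", len(compressed))
```

This only demonstrates that pickling adds a few bytes of framing instead of compressing. It says nothing about how much CPU time serializing a DataFrame with many object columns takes, which would need separate timing.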

----------
messages: 399447
nosy: buhtz
priority: normal
severity: normal
status: open
title: Info about used pickle protocol used by multiprocessing.Queue
versions: Python 3.7, Python 3.8, Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue44901>
_______________________________________

