Standardize error message for non-pickleable types
When you try to pickle or copy a non-pickleable object, you will get an error. In most cases this will be a TypeError with one of a few similar, but different, variants:

"can't pickle XXX objects" (default)
"Cannot serialize XXX object" (socket, BZ2Compressor, BZ2Decompressor)
"can not serialize a 'XXX' object" (buffered files in _pyio)
"cannot serialize 'XXX' object" (FileIO, TextIOWrapper, WinConsoleIO, buffered files in _io, LZMACompressor, LZMADecompressor)
"cannot serialize {} object" (proposed for SSLContext)

Perhaps some of them were added without deep thinking and then were replicated in different places. I'm going to replace all of them with a standardized error message. But I'm unsure which variant is better.

1. "pickle" or "serialize"?
2. "can't", "Cannot", "can not" or "cannot"?
3. "object" or "objects"?
4. Use the "a" article or not?
5. Use quotes around type name or not?

Please help me to choose the best variant.
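For readers who want to see which variant their own interpreter produces, here is a minimal sketch. The helper name pickle_error is made up for illustration, and the exact message text depends on the Python version and on the type involved, so no output is shown.

```python
import pickle
import threading


def pickle_error(obj):
    """Return the TypeError message raised when pickling obj, or None if it pickles."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


# A lock and a generator cannot be pickled; a plain list can.
samples = [threading.Lock(), (n for n in range(3)), [1, 2, 3]]
for obj in samples:
    print(type(obj).__name__, "->", pickle_error(obj))
```

The list line prints None because pickling succeeds; the other two lines show whichever wording the running interpreter uses.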
On Mon, Oct 29, 2018 at 20:42, Serhiy Storchaka <storchaka@gmail.com> wrote:
1. "pickle" or "serialize"?
serialize
2. "can't", "Cannot", "can not" or "cannot"?
cannot
3. "object" or "objects"?
object
4. Use the "a" article or not?
no: "cannot serialize xxx object" (but i'm not a native english speaker, so don't trust me :-))
5. Use quotes around type name or not?
Use repr() in Python, but use '%s' in C since it would be too complex to write code to properly implement repr() (decode tp_name from UTF-8, handle error, call repr, handle error, etc.).
To use repr() on tp_name, I would prefer to have a new formatter, see the thread of last month.
https://mail.python.org/pipermail/python-dev/2018-September/155150.html
Victor
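On the Python side, the repr() suggestion could look roughly like the sketch below. The helper serialize_error and the class Connection are hypothetical names used only for illustration; the point is that the !r conversion applies repr() to the type name, which supplies the quotes.

```python
def serialize_error(obj):
    # Hypothetical helper: {!r} calls repr() on the type name and so adds the quotes.
    return TypeError("cannot serialize {!r} object".format(type(obj).__name__))


class Connection:
    """Hypothetical stand-in for a type that refuses serialization."""


print(serialize_error(Connection()))  # cannot serialize 'Connection' object
```

In C, the equivalent as described in the message above would hard-code the quotes around a '%s' placeholder rather than call repr().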
On 10/29/2018 12:51 PM, Victor Stinner wrote:
On Mon, Oct 29, 2018 at 20:42, Serhiy Storchaka <storchaka@gmail.com> wrote:
1. "pickle" or "serialize"? serialize
2. "can't", "Cannot", "can not" or "cannot"? cannot
3. "object" or "objects"? object
4. Use the "a" article or not? no: "cannot serialize xxx object" (but i'm not a native english speaker, so don't trust me :-))
Cannot serialize an object of type 'XXX'
5. Use quotes around type name or not? Use repr() in Python, but use '%s' in C since it would be too complex to write code to properly implement repr() (decode tp_name from UTF-8, handle error, call repr, handle error, etc.).
To use repr() on tp_name, I would prefer to have a new formatter, see the thread of last month. https://mail.python.org/pipermail/python-dev/2018-September/155150.html
Victor
On Oct 29, 2018, at 12:51, Victor Stinner <vstinner@redhat.com> wrote:
4. Use the "a" article or not?
no: "cannot serialize xxx object" (but i'm not a native english speaker, so don't trust me :-))
It should be fine to leave off the indefinite article.
5. Use quotes around type name or not?
Ideally yes, if it’s easy to implement. -Barry
On Mon, Oct 29, 2018 at 08:51:34PM +0100, Victor Stinner wrote:
On Mon, Oct 29, 2018 at 20:42, Serhiy Storchaka <storchaka@gmail.com> wrote:
1. "pickle" or "serialize"?
serialize
-1
Serializing is more general; pickle is merely one form of serializing:
https://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats
When practical, error messages should be more specific, not less. We don't say "arithmetic operation by zero" for division by zero errors, we specify which arithmetic operation failed.
Unlike most serialization formats, "pickle" is both a noun (the name of the format) and the verb to convert to that format.
-- Steve
On 2018-10-29 22:21, Steven D'Aprano wrote:
On Mon, Oct 29, 2018 at 08:51:34PM +0100, Victor Stinner wrote:
On Mon, Oct 29, 2018 at 20:42, Serhiy Storchaka <storchaka@gmail.com> wrote:
1. "pickle" or "serialize"?
serialize
-1
Serializing is more general; pickle is merely one form of serializing:
https://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats
When practical, error messages should be more specific, not less. We don't say "arithmetic operation by zero" for division by zero errors, we specify which arithmetic operation failed.
Unlike most serialization formats, "pickle" is both a noun (the name of the format) and the verb to convert to that format.
And if you're marshalling, then saying "marshal" is more helpful.
On 2018-10-29 19:38, Serhiy Storchaka wrote:
When you try to pickle or copy a non-pickleable object, you will get an error. In most cases this will be a TypeError with one of a few similar, but different, variants:
"can't pickle XXX objects" (default)
"Cannot serialize XXX object" (socket, BZ2Compressor, BZ2Decompressor)
"can not serialize a 'XXX' object" (buffered files in _pyio)
"cannot serialize 'XXX' object" (FileIO, TextIOWrapper, WinConsoleIO, buffered files in _io, LZMACompressor, LZMADecompressor)
"cannot serialize {} object" (proposed for SSLContext)
Perhaps some of them were added without deep thinking and then were replicated in different places. I'm going to replace all of them with a standardized error message. But I'm unsure which variant is better.
1. "pickle" or "serialize"?
2. "can't", "Cannot", "can not" or "cannot"?
3. "object" or "objects"?
4. Use the "a" article or not?
5. Use quotes around type name or not?
Please help me to choose the best variant.
1. If you're pickling, then saying "pickle" is more helpful.
2. In English the usual long form is "cannot". Error messages tend to avoid abbreviations, and also tend to have lowercase after the colon, e.g.:
"ZeroDivisionError: division by zero"
"ValueError: invalid literal for int() with base 10: 'foo'"
3. If it's failing on an object (singular), then it's clearer to say "object".
4. Articles tend to be omitted.
5. Error messages tend to have quotes around the type name.
Therefore, my preference is for:
"cannot pickle 'XXX' object"
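As an illustration of the preferred wording, a class that refuses pickling could raise the message from __getstate__, which is the hook the default pickling machinery calls. This is only a sketch: the class name Handle is hypothetical, and it is not meant to show how any particular stdlib type implements the check.

```python
import pickle


class Handle:
    """Hypothetical wrapper around a resource that must not be pickled."""

    def __getstate__(self):
        # Lowercase start, "cannot", singular "object", quoted type name, no article.
        raise TypeError(f"cannot pickle {type(self).__name__!r} object")


message = None
try:
    pickle.dumps(Handle())
except TypeError as exc:
    message = str(exc)

print(message)  # cannot pickle 'Handle' object
```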
On 10/29/2018 5:17 PM, MRAB wrote:
On 2018-10-29 19:38, Serhiy Storchaka wrote:
When you try to pickle or copy a non-pickleable object, you will get an error. In most cases this will be a TypeError with one of a few similar, but different, variants:
"can't pickle XXX objects" (default)
"Cannot serialize XXX object" (socket, BZ2Compressor, BZ2Decompressor)
"can not serialize a 'XXX' object" (buffered files in _pyio)
"cannot serialize 'XXX' object" (FileIO, TextIOWrapper, WinConsoleIO, buffered files in _io, LZMACompressor, LZMADecompressor)
"cannot serialize {} object" (proposed for SSLContext)
Perhaps some of them were added without deep thinking and then were replicated in different places. I'm going to replace all of them with a standardized error message.
Great idea.
But I'm unsure what variant is better.
1. "pickle" or "serialize"?
2. "can't", "Cannot", "can not" or "cannot"?
3. "object" or "objects"?
4. Use the "a" article or not?
5. Use quotes around type name or not?
Please help me to choose the best variant.
1. If you're pickling, then saying "pickle" is more helpful.
2. In English the usual long form is "cannot". Error messages tend to avoid abbreviations,
Agree x 3
and also tend to have lowercase after the colon, e.g.:
"ZeroDivisionError: division by zero"
"ValueError: invalid literal for int() with base 10: 'foo'"
I had not noticed, but
IndexError: list index out of range
NameError: name 'sqrt' is not defined
3. If it's failing on an object (singular), then it's clearer to say "object".
4. Articles tend to be omitted.
Grammatically, the two examples above could/should start with 'The'. But that is routinely omitted. Matching a/an to 'xxx' would be a terrible nuisance. "a 'str'" (a string)?, "an 'str'" (an ess tee ar)?
5. Error messages tend to have quotes around the type name.
Therefore, my preference is for:
"cannot pickle 'XXX' object"
+1
-- Terry Jan Reedy
On 29.10.18 23:17, MRAB wrote:
1. If you're pickling, then saying "pickle" is more helpful.
2. In English the usual long form is "cannot". Error messages tend to avoid abbreviations, and also tend to have lowercase after the colon, e.g.:
"ZeroDivisionError: division by zero"
"ValueError: invalid literal for int() with base 10: 'foo'"
3. If it's failing on an object (singular), then it's clearer to say "object".
4. Articles tend to be omitted.
5. Error messages tend to have quotes around the type name.
Therefore, my preference is for:
"cannot pickle 'XXX' object"
Thank you Matthew, I'll use your variant.
Does it change anything that in all these cases pickling fails not just for a specific object, but for all instances of the specified type?
On 10/30/2018 1:12 AM, Serhiy Storchaka wrote:
On 29.10.18 23:17, MRAB wrote:
1. If you're pickling, then saying "pickle" is more helpful.
2. In English the usual long form is "cannot". Error messages tend to avoid abbreviations, and also tend to have lowercase after the colon, e.g.:
"ZeroDivisionError: division by zero"
"ValueError: invalid literal for int() with base 10: 'foo'"
3. If it's failing on an object (singular), then it's clearer to say "object".
4. Articles tend to be omitted.
5. Error messages tend to have quotes around the type name.
Therefore, my preference is for:
"cannot pickle 'XXX' object"
Thank you Matthew, I'll use your variant.
Does it change anything that in all these cases pickling fails not just for a specific object, but for all instances of the specified type?
That's why I suggested "object of type 'XXX'", to leave the type in a more prominent position, as it is generally more important to the issue than the object.
On 2018-10-30 08:12, Serhiy Storchaka wrote:
On 29.10.18 23:17, MRAB wrote:
1. If you're pickling, then saying "pickle" is more helpful.
2. In English the usual long form is "cannot". Error messages tend to avoid abbreviations, and also tend to have lowercase after the colon, e.g.:
"ZeroDivisionError: division by zero"
"ValueError: invalid literal for int() with base 10: 'foo'"
3. If it's failing on an object (singular), then it's clearer to say "object".
4. Articles tend to be omitted.
5. Error messages tend to have quotes around the type name.
Therefore, my preference is for:
"cannot pickle 'XXX' object"
Thank you Matthew, I'll use your variant.
Does it change anything that in all these cases pickling fails not just for a specific object, but for all instances of the specified type?
Well, the other examples you gave did not say explicitly that all instances of that type would fail. If you look at what 'hash' says:
>>> hash(())
3527539
>>> hash(([]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
that would suggest "TypeError: unpicklable type: 'list'", but I'm not sure I'd like too much of "unpicklable", "unmarshallable", "unserializable", etc. :-)
On Mon, Oct 29, 2018 at 22:20, MRAB <python@mrabarnett.plus.com> wrote:
1. If you're pickling, then saying "pickle" is more helpful.
I'm not sure that it's really possible to know if the error occurs while pickle is trying to serialize an object, or if it's a different serialization protocol. We are talking about the very generic __getstate__() method.
I'm in favor of being more general and saying "cannot serialize".
Victor
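The point about the error not being specific to pickle can be checked with a small experiment: copy.copy() falls back on the same reduction protocol as pickle, so a non-pickleable object fails in both. The sketch below only records the messages; their wording depends on the Python version, so none is shown.

```python
import copy
import pickle
import threading

lock = threading.Lock()
messages = {}

# Try both entry points on the same non-pickleable object.
for name, operation in (("pickle.dumps", pickle.dumps), ("copy.copy", copy.copy)):
    try:
        operation(lock)
    except TypeError as exc:
        messages[name] = str(exc)

for name, text in messages.items():
    print(name, "->", text)
```

Both calls raise TypeError, even though only one of them involves the pickle module directly.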
participants (7)
- Barry Warsaw
- Glenn Linderman
- MRAB
- Serhiy Storchaka
- Steven D'Aprano
- Terry Reedy
- Victor Stinner