[Python-Dev] marshal and ssize_t (PEP 353)

"Martin v. Löwis" martin at v.loewis.de
Tue May 15 23:41:56 CEST 2007


> I'm looking at Python/marshal.c and there are a lot of places that
> don't support sequences whose length doesn't fit into an int.
> I looked for marshal referenced in the PEP and didn't find anything.
> Was this an oversight or intentional?

These changes were only made after merging the ssize_t branch,
namely in r42883.

They were intentional, in the sense that the ssize_t changes were
meant to *only* change the API. Supporting larger strings would
have been a change to the marshal format as well, and that was not
within the mandate of PEP 353.

Now, if you think the marshal format should change as well to
support large strings, that may be worth considering. There
are two design alternatives:
- change the 's', 't', and 'u' codes to use an 8-byte length argument.
  That would be an incompatible change, and it would also enlarge
  marshal data that doesn't need it (by 4 bytes per string value).
- introduce additional codes (like 'S', 'T', and 'U') that take
  8-byte lengths. That would be backwards-compatible, in that
  old marshal data can still be read by new implementations,
  and mostly forwards-compatible, in that old implementations can
  still read new data, assuming that S/T/U get used only when
  needed. However, it would complicate the implementation.
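
To make the trade-off between the two alternatives concrete, here is a
minimal Python sketch. It is an illustration only, not code from
marshal.c: the function names and the short_limit parameter are made up
for this example, and 'S' is just the placeholder code mentioned above,
not an assigned marshal type code. It assumes the existing layout of a
one-byte type code followed by a 4-byte little-endian length.

```python
import struct

INT32_MAX = 2**31 - 1  # largest length a signed 4-byte field can hold


def encode_current(payload: bytes) -> bytes:
    """Existing layout (assumed): 's' + 4-byte little-endian length + data."""
    if len(payload) > INT32_MAX:
        raise ValueError("length does not fit into 4 bytes")
    return b"s" + struct.pack("<i", len(payload)) + payload


def encode_widened(payload: bytes) -> bytes:
    """Alternative 1 (hypothetical): keep 's' but always write 8 bytes."""
    return b"s" + struct.pack("<q", len(payload)) + payload


def encode_with_extra_code(payload: bytes, short_limit: int = INT32_MAX) -> bytes:
    """Alternative 2 (hypothetical): use 'S' only when the length needs it.

    short_limit exists only so the wide path can be exercised with tiny data.
    """
    if len(payload) <= short_limit:
        return encode_current(payload)
    return b"S" + struct.pack("<q", len(payload)) + payload


def decode(data: bytes) -> bytes:
    """Reader for alternative 2: accepts both the old and the new code."""
    code = data[:1]
    if code == b"s":
        fmt = "<i"
    elif code == b"S":
        fmt = "<q"
    else:
        raise ValueError("unknown type code %r" % code)
    (length,) = struct.unpack_from(fmt, data, 1)
    start = 1 + struct.calcsize(fmt)
    return data[start:start + length]


if __name__ == "__main__":
    small = b"abc"
    # Alternative 1 costs 4 extra bytes for every string, large or small.
    print(len(encode_widened(small)) - len(encode_current(small)))
    # Alternative 2 leaves small strings byte-for-byte unchanged.
    print(encode_with_extra_code(small) == encode_current(small))
    # Force the wide path with a tiny limit to show the 'S' layout.
    print(decode(encode_with_extra_code(b"abcd", short_limit=3)))
```

In this sketch an old reader that only knows 's' would still accept
everything the second encoder writes for ordinary-sized strings, which is
why that alternative is only "mostly" compatible in that direction: data
that actually contains an 'S' record would be rejected.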

I'm still leaning towards "don't change", since I don't expect
string objects that large to occur in source code, and since I still
think that storing compiled source code in .pyc files is, and should
be, the major application of marshal.

Regards,
Martin

