I recently posted a patch to fix a bug: http://python.org/sf/561858. The patch requires changing the .pyc magic number. Since this bug goes back to 2.1, what is the process for changing the .pyc magic in bugfix releases? I.e., is it allowed?
In this case co_stacksize > 32767, but only a short is written to disk. The limit could be doubled to 65535 (and probably should be) by writing the field unsigned, without changing the magic. But even that isn't sufficient to solve this problem.
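To illustrate (a hand-rolled sketch, not marshal's actual 16-bit writer, though it behaves the same way), here is how a value over 32767 gets silently mangled when squeezed through a 16-bit slot:

    #include <stdio.h>

    /* Sketch: write only the low 16 bits of x, little-endian, the way
     * marshal's 16-bit writer effectively does.  Bits 16-31 are lost. */
    static void
    w_short_sketch(long x, FILE *fp)
    {
        fputc((int)(x & 0xff), fp);
        fputc((int)((x >> 8) & 0xff), fp);
    }

    /* E.g. a co_stacksize of 40000 reads back through a signed 16-bit
     * slot as 40000 - 65536 = -25536, which later crashes the VM. */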
It also brings up a related problem. If the PyCodeObject can't be written to disk, should a .pyc be created at all? The code will run fine the first time, but when imported the second time it will fail.
The other 16-bit values stored are co_argcount, co_nlocals, and co_flags. At least argcount and nlocals aren't too likely to exceed 32K, but co_flags could, and its high bits would currently be dropped silently.
Neal
> I recently posted a patch to fix a bug: http://python.org/sf/561858. The patch requires changing the .pyc magic number. Since this bug goes back to 2.1, what is the process for changing the .pyc magic in bugfix releases? I.e., is it allowed?
Absolutely not!!!!! .pyc files must remain 100% compatible!!! (Imagine someone doing a .pyc-only distribution for 2.1.3 and finding that it doesn't work for 2.1.4!)
> In this case co_stacksize > 32767, but only a short is written to disk. The limit could be doubled to 65535 (and probably should be) by writing the field unsigned, without changing the magic. But even that isn't sufficient to solve this problem.
I guess the only way to fix this in 2.1.x is to raise an error -- that's better than the crash that will follow if you try to execute that code.
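A sketch of what that error could look like at the point where the 16-bit fields are written (a hypothetical helper, not CPython code; marshal's real writer would also have to record the failure and abort the dump):

    #include "Python.h"
    #include "compile.h"        /* PyCodeObject (Python 2.x layout) */

    /* Sketch: refuse to marshal a code object whose fields won't fit
     * in the 16-bit .pyc slots, instead of truncating and crashing
     * on a later import. */
    static int
    code_fits_pyc_sketch(PyCodeObject *co)
    {
        if (co->co_argcount  > 0x7fff || co->co_nlocals > 0x7fff ||
            co->co_stacksize > 0x7fff || co->co_flags   > 0x7fff) {
            PyErr_SetString(PyExc_ValueError,
                            "code object too large to marshal");
            return 0;
        }
        return 1;
    }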
> It also brings up a related problem. If the PyCodeObject can't be written to disk, should a .pyc be created at all? The code will run fine the first time, but when imported the second time it will fail.
What do you mean by "can't be written to disk"? Is the disk full? Is there another kind of write error? The magic number is written last, only when the write is successful.
> The other 16-bit values stored are co_argcount, co_nlocals, and co_flags. At least argcount and nlocals aren't too likely to exceed 32K, but co_flags could, and its high bits would currently be dropped silently.
If you're going to change the marshal format anyway, I'd increase all of them to 32-bit ints. After all, I thought the stacksize would never exceed 32K either...
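In marshal.c's code-object writer that widening is mechanical; a sketch of the before and after (w_short() and w_long() are marshal's internal helpers, p its output handle):

    /* Today -- each field silently truncated to 16 bits: */
    w_short(co->co_argcount, p);
    w_short(co->co_nlocals, p);
    w_short(co->co_stacksize, p);
    w_short(co->co_flags, p);

    /* After the change -- full 32-bit slots.  The on-disk layout
     * changes, which is exactly why the magic must be bumped: */
    w_long(co->co_argcount, p);
    w_long(co->co_nlocals, p);
    w_long(co->co_stacksize, p);
    w_long(co->co_flags, p);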
--Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
> > I recently posted a patch to fix a bug: http://python.org/sf/561858. The patch requires changing the .pyc magic number. Since this bug goes back to 2.1, what is the process for changing the .pyc magic in bugfix releases? I.e., is it allowed?
> Absolutely not!!!!! .pyc files must remain 100% compatible!!! (Imagine someone doing a .pyc-only distribution for 2.1.3 and finding that it doesn't work for 2.1.4!)
Ok, I'll work on a patch for 2.1/2.2.
In looking through other magic-number code, I found that when -U was removed, it was only removed from the usage message. Should the option and code be removed altogether? I.e., -U is still used by getopt and still changes the magic, so -U is just a hidden option now.
> > It also brings up a related problem. If the PyCodeObject can't be written to disk, should a .pyc be created at all? The code will run fine the first time, but when imported the second time it will fail.
> What do you mean by "can't be written to disk"? Is the disk full? Is there another kind of write error? The magic number is written last, only when the write is successful.
Disk full was one condition. The other was a value that is 32 bits in memory but has only 16 bits written to disk; based on your comment to increase all of the 16-bit values in PyCode to 32 bits, that will no longer be the case. There could still be transient write errors that corrupt the file, since only part of the data would be written. One case where this could happen is an interrupted system call.
There is one other possible problem. [wr]_short() is now only used in one place: for long digits, which are unsigned. But r_short() does sign extension. Is this a problem?
Neal
> In looking through other magic-number code, I found that when -U was removed, it was only removed from the usage message. Should the option and code be removed altogether? I.e., -U is still used by getopt and still changes the magic, so -U is just a hidden option now.
-U is a handy option for developers wanting to test Unicode conformance of their code, but the help message promised more than it could deliver. Please leave this alone.
> There is one other possible problem. [wr]_short() is now only used in one place: for long digits, which are unsigned. But r_short() does sign extension. Is this a problem?
Long digits are only 15 bits, so if you change it to return an unsigned short that shouldn't matter. Dunno if there's magic for negative numbers though (in memory, the length is negative, but the digits are not).
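A quick sketch of why the 15-bit limit makes it safe (this mimics marshal's r_short(), it isn't the real code):

    #include <stdio.h>

    /* Sketch: read a little-endian 16-bit value into a short.  Bit 15
     * lands in the sign bit, which is where the sign extension comes
     * from once the short is promoted back to int. */
    static int
    r_short_sketch(FILE *fp)
    {
        short x;
        x = (short)fgetc(fp);
        x |= (short)(fgetc(fp) << 8);
        return x;       /* promoted to int, sign and all */
    }

    /* A long digit is at most 0x7fff (15 bits), so bit 15 is never set
     * and the sign extension can never fire for digits. */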
--Guido van Rossum (home page: http://www.python.org/~guido/)
[Guido]
> Long digits are only 15 bits, so if you change it to return an unsigned short that shouldn't matter. Dunno if there's magic for negative numbers though (in memory, the length is negative, but the digits are not).
The marshal format is the same: signed length and unsigned digits. The signed length goes through [rw]_long().
On 1 Jun 2002 at 10:15, Neal Norwitz wrote:
Guido van Rossum wrote:
> > What do you mean by "can't be written to disk"?
> Disk full was one condition.
I can't be 100% sure of the cause, but I *have* seen this (a bad .pyc file that had to be deleted before the module would import). The .pyc was woefully short but passed the magic test. I think this was 2.1, maybe 2.0.
This was during a firestorm at a client site, so I didn't get around to a bug report.
-- Gordon http://www.mcmillan-inc.com/
[GMcM]
> I can't be 100% sure of the cause, but I *have* seen this (a bad .pyc file that had to be deleted before the module would import). The .pyc was woefully short but passed the magic test. I think this was 2.1, maybe 2.0.
Hm... Here's the code responsible for writing .pyc files:
    static void
    write_compiled_module(PyCodeObject *co, char *cpathname, long mtime)
    {
        [...]
        PyMarshal_WriteLongToFile(pyc_magic, fp);
        /* First write a 0 for mtime */
        PyMarshal_WriteLongToFile(0L, fp);
        PyMarshal_WriteObjectToFile((PyObject *)co, fp);
        if (ferror(fp)) {
            /* Don't keep partial file */
            fclose(fp);
            (void) unlink(cpathname);
            return;
        }
        /* Now write the true mtime */
        fseek(fp, 4L, 0);
        PyMarshal_WriteLongToFile(mtime, fp);
        fflush(fp);
        fclose(fp);
        [...]
    }
It's been like this for a very long time. It always writes the magic number, but withholds the mtime until it's done writing without errors. And if the mtime doesn't match, the .pyc is ignored (unless there's no .py file...).
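For reference, a simplified sketch of that import-side check (condensed from import.c's check_compiled_module(); error reporting omitted):

    #include "Python.h"
    #include "marshal.h"    /* PyMarshal_ReadLongFromFile() */
    #include <stdio.h>

    /* Sketch: a .pyc is trusted only if both the magic number and the
     * embedded mtime match; otherwise fall back to recompiling. */
    static FILE *
    check_compiled_sketch(const char *cpathname, long magic, long mtime)
    {
        FILE *fp = fopen(cpathname, "rb");
        if (fp == NULL)
            return NULL;
        if (PyMarshal_ReadLongFromFile(fp) != magic ||
            PyMarshal_ReadLongFromFile(fp) != mtime) {
            fclose(fp);     /* stale, foreign, or truncated header */
            return NULL;
        }
        return fp;          /* positioned at the marshalled code object */
    }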
The only way this could write the correct mtime but not all the marshalled data would be if ferror(fp) doesn't actually indicate an error after a write failure due to a disk-full condition. And that's a stdio quality-of-implementation issue.
I'm not sure if there's anything I could do differently to make this more robust. (I guess I could write the correct magic number at the end too.)
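That variant would be a small change to the function above; a sketch, reusing its variables:

    /* Sketch: withhold the magic as well as the mtime until the body
     * is known to be complete, then patch both into the header. */
    PyMarshal_WriteLongToFile(0L, fp);          /* placeholder magic */
    PyMarshal_WriteLongToFile(0L, fp);          /* placeholder mtime */
    PyMarshal_WriteObjectToFile((PyObject *)co, fp);
    if (ferror(fp)) {
        fclose(fp);
        (void) unlink(cpathname);
        return;
    }
    fseek(fp, 0L, 0);                           /* rewind to the header */
    PyMarshal_WriteLongToFile(pyc_magic, fp);   /* now the real magic */
    PyMarshal_WriteLongToFile(mtime, fp);       /* and the real mtime */
    fflush(fp);
    fclose(fp);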
--Guido van Rossum (home page: http://www.python.org/~guido/)