[Python-Dev] Identifying magic prefix on Python files?

Tim Peters tim.one@home.com
Sun, 4 Feb 2001 19:07:39 -0500


[Eric S. Raymond]
> ...
> What I'd like to throw in the pot is the cleverest file signature
> design I've ever seen -- PNG's.  Here's a quote from the PNG spec:
>
> ------------------------------------------------------------------
> The first eight bytes of a PNG file always contain the following
> values:
>
>    (decimal)              137  80  78  71  13  10  26  10
>    (hexadecimal)           89  50  4e  47  0d  0a  1a  0a
>    (ASCII C notation)    \211   P   N   G  \r  \n \032 \n

Cool!  I vote we take it exactly.  I don't even know what PNG is, so it's
doubtful my Windows box will be confused by decorating Python files the same
way <wink>.

> The first two bytes distinguish PNG files on systems that expect
> the first two bytes to identify the file type uniquely.
> The first byte is chosen as a non-ASCII value to reduce the
> probability that a text file may be misrecognized as a PNG file; also,
> it catches bad file transfers that clear bit 7.

OK, I suggest (decimal) 143 for Python's first byte.  That's a "control
code" in Latin-1, and (unlike PNG's 137) not even Windows assigns it to a
character in their Latin-1 superset (yet).

    (decimal)              143  80  89  84  13  10  26  10
    (hexadecimal)           8f  50  59  54  0d  0a  1a  0a
    (ASCII C notation)    \217   P   Y   T  \r  \n \032 \n
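
For anyone who wants to poke at this, here is a minimal sketch in Python of how such a prefix could be checked. It is not part of the proposal: the names PY_SIGNATURE, has_signature, clear_bit_7 and cp1252_char are made up for illustration, and the only thing taken from the messages above is the byte values. It assumes Python's "cp1252" codec is a fair stand-in for the Windows Latin-1 superset.

```python
# Illustrative sketch only: every name here is hypothetical.
# The byte values are the ones listed in the tables above.
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])  # \211 P N G \r \n \032 \n
PY_SIGNATURE = bytes([143, 80, 89, 84, 13, 10, 26, 10])   # \217 P Y T \r \n \032 \n


def has_signature(data, signature=PY_SIGNATURE):
    """Return True if data begins with the given 8-byte signature."""
    return data.startswith(signature)


def clear_bit_7(data):
    """Simulate a bad transfer that clears bit 7 of every byte."""
    return bytes(b & 0x7F for b in data)


def cp1252_char(byte_value):
    """Return the Windows-1252 character for a byte, or None if unassigned.

    Assumption: Python's "cp1252" codec reflects the Windows mapping.
    """
    try:
        return bytes([byte_value]).decode("cp1252")
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    payload = PY_SIGNATURE + b"...rest of file..."
    print(has_signature(payload))               # prefix intact
    print(has_signature(clear_bit_7(payload)))  # prefix damaged by 7-bit transfer
    print(cp1252_char(137), cp1252_char(143))   # PNG's first byte vs. the proposed one
```

Because the first byte has bit 7 set, a transfer that strips the high bit turns 143 into 15 and the prefix no longer matches, which is the failure the PNG design is meant to catch. An ordinary text file starts with an ASCII byte and so never matches either.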