[Python-Dev] Identifying magic prefix on Python files?
Eric S. Raymond
esr@thyrsus.com
Sun, 4 Feb 2001 19:34:41 -0500
Tim Peters <tim.one@home.com>:
> > The first eight bytes of a PNG file always contain the following
> > values:
> >
> > (decimal) 137 80 78 71 13 10 26 10
> > (hexadecimal) 89 50 4e 47 0d 0a 1a 0a
> > (ASCII C notation) \211 P N G \r \n \032 \n
>
> Cool! I vote we take it exactly. I don't even know what PNG is, so it's
> doubtful my Windows box will be confused by decorating Python files the same
> way <wink>.
>
> > The first two bytes distinguish PNG files on systems that expect
> > the first two bytes to identify the file type uniquely.
> > The first byte is chosen as a non-ASCII value to reduce the
> > probability that a text file may be misrecognized as a PNG file; also,
> > it catches bad file transfers that clear bit 7.
>
> OK, I suggest (decimal) 143 for Python's first byte. That's a "control
> code" in Latin-1, and (unlike PNG's 137) not even Windows assigns it to a
> character in their Latin-1 superset (yet).
>
> (decimal) 143 80 89 84 13 10 26 10
> (hexadecimal) 8f 50 59 54 0d 0a 1a 0a
> (ASCII C notation) \217 P Y T \r \n \032 \n
\217 is good. It doesn't occur in /usr/share/magic at all, which
is a good sign. Why just PYT, though? Why not spell out "Python"?
That would let us detect case-smashing, too.
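
To make the checks discussed above concrete, here is a minimal sketch in Python of how a reader might test a file's leading bytes against the proposed prefix. It assumes the 8-byte "PYT" layout quoted above, which is only a proposal in this thread and not an actual Python file format. The names PROPOSED_MAGIC and diagnose_prefix are hypothetical, invented for this illustration.

```python
# Hypothetical prefix taken from the proposal in this thread:
# (decimal) 143 80 89 84 13 10 26 10, i.e. \217 P Y T \r \n \032 \n
PROPOSED_MAGIC = bytes([143, 80, 89, 84, 13, 10, 26, 10])


def diagnose_prefix(data: bytes) -> str:
    """Classify the start of `data` against the proposed prefix."""
    if data.startswith(PROPOSED_MAGIC):
        return "ok"

    head = data[:len(PROPOSED_MAGIC)]

    # A 7-bit transfer clears the high bit of every byte, so 0x8f becomes 0x0f.
    if head == bytes(b & 0x7F for b in PROPOSED_MAGIC):
        return "bit 7 cleared"

    # A text-mode transfer that rewrites CR LF as LF shortens the prefix.
    if data.startswith(PROPOSED_MAGIC.replace(b"\r\n", b"\n")):
        return "CRLF converted to LF"

    # Lower-casing turns "PYT" into "pyt"; bytes.upper() touches only ASCII
    # letters, so the non-ASCII first byte is compared unchanged.
    if head.upper() == PROPOSED_MAGIC:
        return "case smashed"

    return "no prefix"


if __name__ == "__main__":
    print(diagnose_prefix(PROPOSED_MAGIC + b"print 'hello'\n"))
    print(diagnose_prefix(b"\x8fPYT\n\x1a\n"))
    print(diagnose_prefix(b"\x8fpyt\r\n\x1a\n"))
```

Note that an all-uppercase tag such as "PYT" can only reveal smashing to lowercase. A mixed-case tag such as "Python" would change under smashing in either direction, which is the point of spelling it out.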
--
Eric S. Raymond <http://www.tuxedo.org/~esr/>
False is the idea of utility that sacrifices a thousand real advantages for
one imaginary or trifling inconvenience; that would take fire from men because
it burns, and water because one may drown in it; that has no remedy for evils
except destruction. The laws that forbid the carrying of arms are laws of
such a nature. They disarm only those who are neither inclined nor determined
to commit crimes.
-- Cesare Beccaria, as quoted in Thomas Jefferson's Commonplace Book