Read a content file from a P7M

Luca lucafbb at gmail.com
Fri Aug 28 09:33:39 EDT 2009


On Fri, Mar 20, 2009 at 4:12 PM, Emanuele Rocca <ema at linux.it> wrote:
> On 11/03/09 - 05:05, Luca wrote:
>> Is there a standard or suggested way in Python to read the content of a P7M file?
>>
>> I don't need features such as verifying the signature or signing with a certificate.
>> I only need to extract the content file of the P7M (a doc, a PDF, ...)
>
> For PDF files you can just remove the P7M content before %PDF and after
> %%EOF.
>
> The following snippet converts /tmp/test.p7m into PDF, saving the
> resulting document into /tmp/test.pdf:
>
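> import re
> from gzip import GzipFile
>
> # Read the P7M as bytes (this assumes the file is gzip-compressed)
> with GzipFile('/tmp/test.p7m') as p7m:
>     contents = p7m.read()
>
> # Keep everything from the PDF header to the last %%EOF marker
> contents_re = re.compile(b'%PDF-.*%%EOF', re.DOTALL)
> contents = contents_re.search(contents).group()
>
> # Write in binary mode so the PDF bytes are not altered
> with open('/tmp/test.pdf', 'wb') as pdf:
>     pdf.write(contents)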
>

After all this time, I just wanted to say THANKS!
The PDF example works perfectly; I only had to skip the GzipFile call
(it seems that our PDFs are not gzipped).
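
For the record, the variant without GzipFile could look roughly like the
sketch below. The names extract_pdf and p7m_to_pdf are only illustrative
(they are not from any library), and the approach assumes the PDF bytes sit
in one contiguous piece inside the P7M. If the signing tool split the
content into chunks, extra bytes may end up inside the result.

```python
import re

# Greedy match from the first PDF header to the last %%EOF marker
PDF_RE = re.compile(rb'%PDF-.*%%EOF', re.DOTALL)

def extract_pdf(p7m_bytes):
    """Return the embedded PDF bytes, or None if no PDF markers are found."""
    match = PDF_RE.search(p7m_bytes)
    return match.group() if match else None

def p7m_to_pdf(p7m_path, pdf_path):
    # Hypothetical helper: read the raw P7M (no GzipFile) and write the PDF
    with open(p7m_path, 'rb') as src:
        pdf = extract_pdf(src.read())
    if pdf is None:
        raise ValueError('no PDF content found in %s' % p7m_path)
    # Binary mode keeps the PDF bytes unchanged
    with open(pdf_path, 'wb') as dst:
        dst.write(pdf)
```

Calling p7m_to_pdf('/tmp/test.p7m', '/tmp/test.pdf') would then do the same
job as the original snippet, minus the decompression step.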

Unfortunately, it now seems that we need the same feature for non-PDF files...
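
One possible direction for arbitrary content, sketched here and not tested
against our files: let the openssl command-line tool unpack the envelope
instead of searching for format-specific markers. This assumes openssl is
installed and that the P7M is DER-encoded (a Base64/PEM file would need a
different -inform value). The function names are only illustrative. Note
that -noverify skips validation of the signer's certificate chain only, so
openssl may still reject a file whose signature does not check out.

```python
import subprocess

def openssl_extract_command(p7m_path, out_path):
    """Build the openssl command line that writes the signed content to out_path."""
    return [
        'openssl', 'smime', '-verify',
        '-noverify',          # do not validate the signer's certificate chain
        '-inform', 'DER',     # assumption: binary (DER) P7M, not Base64/PEM
        '-in', p7m_path,
        '-out', out_path,
    ]

def extract_content(p7m_path, out_path):
    # Raises CalledProcessError if openssl exits with a non-zero status
    subprocess.run(openssl_extract_command(p7m_path, out_path), check=True)
```

Something like extract_content('/tmp/test.p7m', '/tmp/test.doc') should then
work for any content type, as long as the assumptions above hold.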

-- 
-- luca


