PDF: finding a blank image

DrLeif l.lensgraf at gmail.com
Tue Jul 14 16:04:16 CEST 2009


On Jul 13, 6:22 pm, Scott David Daniels <Scott.Dani... at Acm.Org> wrote:
> DrLeif wrote:
> > I have about 6000 PDF files which have been produced using a scanner
> > with more being produced each day.  The PDF files contain old paper
> > records which have been taking up space.   The scanner is set to
> > detect when there is information on the backside of the page (duplex
> > scan).  The problem, of course, is that it's not always reliable and we
> > wind up with a number of PDF files containing blank pages.
>
> > What I would like to do is have Python detect "blank" pages in a PDF
> > file and remove them.  Any suggestions?
>
> I'd check into ReportLab's commercial product; it may well be easily
> capable of that.  If no success, you might contact PJ at Groklaw, she
> has dealt with a _lot_ of PDFs (and knows people who deal with PDFs
> in bulk).
>
> --Scott David Daniels
> Scott.Dani... at Acm.Org


Thanks everyone for the quick reply.

I had considered using ReportLab; however, I was uncertain about its
ability to detect a blank page.
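
To make "blank" concrete: a scanned page is never perfectly white, so the
test probably has to tolerate a few specks.  Below is only a rough sketch
of that idea, not anything ReportLab provides.  It assumes each page has
already been rendered to a grayscale pixel grid by some other tool (not
shown here); the function name is_blank and both thresholds are made up
for illustration and would need tuning against real scans.

```python
def is_blank(pixels, dark_below=200, max_dark_fraction=0.005):
    """Guess whether a grayscale page image is blank.

    pixels: iterable of rows, each row an iterable of ints,
            0 (black) .. 255 (white).
    dark_below: values under this count as "ink" (assumed threshold).
    max_dark_fraction: share of ink pixels tolerated as scanner noise.
    """
    total = 0
    dark = 0
    for row in pixels:
        for value in row:
            total += 1
            if value < dark_below:
                dark += 1
    if total == 0:
        # No pixels at all: nothing on the page, so treat it as blank.
        return True
    return dark / total <= max_dark_fraction


# A 100x100 white page with a handful of dust specks.
speckled = [[255] * 100 for _ in range(100)]
for i in range(10):
    speckled[i][i] = 0

# The same size page with five fully dark rows, standing in for text.
written = [[255] * 100 for _ in range(100)]
for r in range(20, 25):
    written[r] = [0] * 100
```

With these thresholds the speckled page counts as blank and the written
one does not.  Removing the pages that fail the test from the PDF is a
separate step that this sketch does not cover.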

Scott, I'll drop an email to ReportLab and PJ....

Thanks again,
DrLeif


