Errors with PyPdf
flebber
flebber.crue at gmail.com
Mon Sep 27 10:19:34 EDT 2010
On Sep 27, 2:46 pm, Dave Angel <da... at ieee.org> wrote:
> On 2:59 PM, flebber wrote:
>
> > <snip>
> > Traceback (most recent call last):
> > File "C:/Python26/Pdfread", line 16, in <module>
> > open('x.txt', 'w').write(content)
> > NameError: name 'content' is not defined
> > When I use:
>
> > import pyPdf
>
> > def getPDFContent(path):
> >     content = "C:\Components-of-Dot-NET.txt"
> >     # Load PDF into pyPDF
> >     pdf = pyPdf.PdfFileReader(file(path, "rb"))
> >     # Iterate pages
> >     for i in range(0, pdf.getNumPages()):
> >         # Extract text from page and add to content
> >         content += pdf.getPage(i).extractText() + "\n"
> >     # Collapse whitespace
> >     content = " ".join(content.replace(u"\xa0", " ").strip().split())
> >     return content
>
> > print getPDFContent(r"C:\Components-of-Dot-NET.pdf").encode("ascii",
> > "ignore")
> > open('x.txt', 'w').write(content)
>
> There's no global variable content; that name was local to the
> function, so it's lost when the function exits. The function does
> return the value, but you give it to print and don't save it anywhere.
>
> data = getPDFContent(r"C:\Components-of-Dot-NET.pdf").encode("ascii",
> "ignore")
>
> outfile = open('x.txt', 'w')
> outfile.write(data)
>
> outfile.close()
>
> I used a different name to emphasize that this is *not* the same
> variable as content inside the function. In this case, it happens to
> have the same value. And if you used the same name, you could be
> confused about which is which.
>
> DaveA
Thank You everyone.
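
For anyone finding this thread in the archive, here is a minimal sketch of the scoping point Dave describes. It deliberately leaves pyPdf out so it runs anywhere: get_content and save_text are hypothetical stand-ins I made up for illustration (they are not part of pyPdf), the sample text is invented, and writing to the system temp directory is just an assumption to keep the example tidy.

```python
import os
import tempfile


def get_content():
    # 'content' is local to this function; the name disappears on return.
    content = "page one\n" + "page two\n"
    # Collapse whitespace, in the same way as the function in the thread.
    return " ".join(content.split())


def save_text(text, path):
    # The with-block closes the file even if write() raises.
    with open(path, "w") as outfile:
        outfile.write(text)


# Bind the returned value to a module-level name before using it.
data = get_content()
out_path = os.path.join(tempfile.gettempdir(), "x.txt")
save_text(data, out_path)

# Referring to the function's local name out here still fails.
scope_error = None
try:
    content
except NameError as exc:
    scope_error = str(exc)
```

The only change from the failing script is that the return value is stored in data and that name is what gets written. Using a with-block instead of a separate close call is a style choice, not something the fix depends on.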