[Chicago] scrape power point
Brian Ray
brianhray at gmail.com
Thu Sep 23 17:57:11 CEST 2010
On Sep 23, 2010, at 10:44 AM, Carl Karsten <carl at personnelware.com> wrote:
>> Getting data isn't hard, it's the metadata that's difficult. I have lots of existing (mostly) HTML, Excel spreadsheets, Word docs, and Power Point files.
>
> Do you have python code to scrape the text from Power Point files?
>
> I would like to be able to scrape the text from Power Point, Keynote
> and whatever else a presenter might use for PyCon talks. I am sure
> it's a previously solved problem, but it is currently low on my list of
> things to even google.
>
I recall a Google Docs hack for this. I think you just give it a URL pointing to the location of your PPT, and it returns HTML. Then BeautifulSoup it :)
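
A minimal sketch of that last step, assuming the converted HTML is already in hand as a string. It uses the standard library's html.parser as a stand-in for BeautifulSoup so it runs with no extra installs. The names TextExtractor and extract_text and the sample HTML are made up for illustration, and the Google Docs URL is left out because it is not given above.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []       # one entry per non-empty run of text
        self._skip_depth = 0   # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text fragments of an HTML string, in order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.chunks


# Hypothetical stand-in for whatever HTML the conversion returns.
SAMPLE_HTML = """
<html><head><style>p { color: red; }</style></head>
<body><h1>Slide 1</h1><p>Hello <b>PyCon</b></p></body></html>
"""

if __name__ == "__main__":
    for line in extract_text(SAMPLE_HTML):
        print(line)
```

With BeautifulSoup 4 installed, BeautifulSoup(html, "html.parser").get_text() should be a shorter route to roughly the same text, though how it treats script and style contents may differ.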