Have you compared your tool with existing ones, such as https://blog.chezo.uno/tabula-py-extract-table-from-pdf-into-python-datafram... ? What notable differences in API and/or accuracy does it have?

On Fri, Sep 28, 2018 at 2:32 PM Vinayak Mehta <vmehta94@gmail.com> wrote:
I've created a Jupyter notebook which shows an example of how Camelot makes it easy to extract tables out of PDFs.
In the example, I scrape a PDF from an Indian disease outbreaks data source[1] using requests, extract tables from each page of the PDF using Camelot, and then concatenate those tables. Here's the gist! https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873 :)
[1] http://idsp.nic.in/index4.php?lang=1&level=0&linkid=406&lid=3689
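
For readers who don't want to open the notebook, the workflow described above can be sketched roughly as follows. This is not the code from the gist, just a minimal illustration under a few assumptions: the PDF URL and the helper names (save_pdf, download_pdf, extract_all_tables) are hypothetical, the "%PDF" header check is an added sanity step, and the camelot.read_pdf call with pages="1-end" follows Camelot's documented usage, which may differ between versions.

```python
from pathlib import Path


def save_pdf(content: bytes, dest: str) -> Path:
    """Write downloaded bytes to disk after a basic sanity check.

    Hypothetical helper: it only checks that a "%PDF" marker appears
    near the start of the payload, which catches HTML error pages.
    """
    if b"%PDF" not in content[:1024]:
        raise ValueError("downloaded content does not look like a PDF")
    path = Path(dest)
    path.write_bytes(content)
    return path


def download_pdf(url: str, dest: str) -> Path:
    """Fetch a PDF with requests and store it locally."""
    import requests  # third-party; imported lazily

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return save_pdf(response.content, dest)


def extract_all_tables(pdf_path):
    """Extract tables from every page and stack them into one DataFrame."""
    import camelot  # third-party; imported lazily
    import pandas as pd

    # "1-end" asks Camelot to process all pages of the document.
    tables = camelot.read_pdf(str(pdf_path), pages="1-end")
    # Each extracted table exposes its data as a pandas DataFrame via .df.
    # Note: pd.concat raises ValueError if no tables were found.
    return pd.concat([table.df for table in tables], ignore_index=True)


if __name__ == "__main__":
    # Placeholder URL (an assumption); substitute a real PDF link
    # taken from the data source in [1].
    PDF_URL = "https://example.org/outbreaks.pdf"
    pdf = download_pdf(PDF_URL, "outbreaks.pdf")
    print(extract_all_tables(pdf).head())
```

The third-party imports are kept inside the functions so the download check can be tried without Camelot installed.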
On Fri, Sep 28, 2018 at 12:01 PM Vinayak Mehta <vmehta94@gmail.com> wrote:
Hello everyone!
I recently released a Python library which lets users extract data tables out of PDF files, my first open source library! Here's the link: https://github.com/socialcopsdev/camelot
I've created a wiki page <https://github.com/socialcopsdev/camelot/wiki/Comparison-with-other-PDF-Tabl...> comparing it to other open source PDF table extraction tools. I'm currently working on porting it to Python 3!
I would be really grateful if you could check it out and see if it's useful to you. Any feedback that may help me improve it is welcome, whether by replying here, opening an issue, or sending a pull request!
Looking forward to hearing from you all!
Thanks for your time!
Vinayak
_______________________________________________
PSF-Community mailing list
PSF-Community@python.org
https://mail.python.org/mailman/listinfo/psf-community
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.