Page layout in Python

maurog qualsivoglia at
Fri Jul 25 16:01:48 CEST 2014

The first step in grabbing information from a PDF file is to translate it
into text format with the pdftotext -layout command.
Is there any specific Python tool or library to describe the layout of a
page with ASCII characters and to help in identifying and extracting the
useful pieces of information? For example, a function that selects
N characters at line I starting from column Y.
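
To make the request concrete, here is a minimal sketch of such a function in plain Python, with no third-party library. It assumes the pdftotext -layout output has already been read into a string; the name select and the 0-based line/column convention are only illustrative choices, not an existing API.

```python
def select(lines, line, col, n):
    """Return n characters from row `line`, starting at column `col`.

    Both indices are 0-based (an assumption of this sketch). Short rows
    are padded with spaces so the page behaves like a fixed-width grid.
    """
    if not 0 <= line < len(lines) or col < 0 or n < 0:
        return ""
    return lines[line].ljust(col + n)[col:col + n]


# Stand-in for text produced by: pdftotext -layout input.pdf output.txt
page_text = "Name      Qty\nWidget      3\n"
lines = page_text.splitlines()

print(select(lines, 0, 10, 3))  # the header cell at columns 10-12
print(select(lines, 1, 0, 6))   # the first six characters of the second row
```

Padding with ljust means a request that runs past the end of a short line returns spaces rather than a shorter string, which keeps column arithmetic predictable.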

If no such tool is available, what is, in your opinion, the best structure
for describing a two-dimensional page layout in Python?
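
One possible structure, offered only as a sketch: keep each page as a list of rows padded to the same width, so that (row, column) addressing and rectangular slices become simple string slicing. The names Page, region, column and split_pages are hypothetical, and coordinates are assumed to be 0-based. The sketch relies on pdftotext separating pages with a form feed character.

```python
class Page:
    """A page as a rectangular grid of characters."""

    def __init__(self, text):
        rows = text.splitlines()
        # Pad every row to the widest one so all rows have equal length.
        self.width = max((len(r) for r in rows), default=0)
        self.rows = [r.ljust(self.width) for r in rows]

    def region(self, top, left, height, width):
        """Return a rectangular block as a list of strings."""
        return [row[left:left + width] for row in self.rows[top:top + height]]

    def column(self, left, width):
        """Return one vertical strip, with surrounding spaces stripped."""
        return [row[left:left + width].strip() for row in self.rows]


def split_pages(document):
    """Split pdftotext output into Page objects on form feed characters."""
    return [Page(chunk) for chunk in document.split("\f") if chunk.strip()]


# Stand-in for a two-page document produced by pdftotext -layout.
document = "ID   Name\n1    Ann\n2    Bob\n\fTotal: 2\n"
pages = split_pages(document)

print(pages[0].region(1, 5, 2, 3))  # the names under the "Name" header
print(pages[0].column(0, 2))        # the first column of page one
```

A list of padded strings is cheap and easy to debug by printing. If many numeric operations on columns are needed, a 2-D array of characters could be swapped in behind the same methods.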

More information about the Python-list mailing list