[Tutor] need help generating table of contents

Cameron Simpson cs at cskk.id.au
Sat Aug 25 18:40:58 EDT 2018


On 24Aug2018 17:55, Peter Otten <__peter__ at web.de> wrote:
>Albert-Jan Roskam wrote:
>> I have Ghostscript files with a table of contents (toc) and I would like
>> to use this info to generate a human-readable toc. The problem is: I can't
>> get the (nested) hierarchy right.
>>
>> import re
>>
>> toc = """\
>> [ /PageMode /UseOutlines
>>   /Page 1
>>   /View [/XYZ null null 0]
>>   /DOCVIEW pdfmark
>> [ /Title (Title page)
>>   /Page 1
>>   /View [/XYZ null null 0]
>>   /OUT pdfmark
>> [ /Title (Document information)
>>   /Page 2
>>   /View [/XYZ null null 0]
>>   /OUT pdfmark
[...]
>> What is the best approach to do this?
>
>The best approach is probably to use some tool/library that understands
>postscript.

Just on this point: I disagree. IIRC, names like '/Title' carry no built-in
meaning in PostScript itself; they are keys passed to 'pdfmark', which is an
extension defined by the PDF tooling (or by the document's own prologue), not
by the core language. So a generic PostScript tool won't necessarily have any
way to extract semantics like titles from a document.

The OP presumably has the specific output of a particular tool, with this nice,
well-structured PostScript, so he or she needs to write a special-purpose
parser for it.
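
As a rough sketch of what such a parser might look like (untested against the
OP's real files): the code below splits the text on "pdfmark", keeps only the
/OUT entries, and pulls out /Title and /Page with regular expressions. The
nesting part rests on an assumption, since the quoted sample is cut off before
any nested entries: that parent entries carry a "/Count N" key giving their
number of immediate children, which is the usual pdfmark convention. The names
parse_toc, format_toc and SAMPLE, and the chapter titles and page numbers, are
made up for illustration.

```python
import re

# Hypothetical sample in the shape quoted above. The /Count line and the
# chapter/section entries are assumptions; the original post elides them.
SAMPLE = """\
[ /PageMode /UseOutlines
  /Page 1
  /View [/XYZ null null 0]
  /DOCVIEW pdfmark
[ /Title (Title page)
  /Page 1
  /View [/XYZ null null 0]
  /OUT pdfmark
[ /Count 2
  /Title (Chapter 1)
  /Page 3
  /View [/XYZ null null 0]
  /OUT pdfmark
[ /Title (Section 1.1)
  /Page 3
  /View [/XYZ null null 0]
  /OUT pdfmark
[ /Title (Section 1.2)
  /Page 5
  /View [/XYZ null null 0]
  /OUT pdfmark
[ /Title (Chapter 2)
  /Page 9
  /View [/XYZ null null 0]
  /OUT pdfmark
"""

# Simplification: titles containing ")" or the word "pdfmark" will confuse this.
TITLE_RE = re.compile(r"/Title\s*\((.*?)\)")
PAGE_RE = re.compile(r"/Page\s+(\d+)")
COUNT_RE = re.compile(r"/Count\s+(-?\d+)")


def parse_toc(text):
    """Return a list of (depth, title, page) tuples in document order."""
    result = []
    remaining = []  # children still expected at each open nesting level
    for chunk in text.split("pdfmark"):
        if not chunk.rstrip().endswith("/OUT"):
            continue  # skip /DOCVIEW and anything else that is not an outline entry
        title = TITLE_RE.search(chunk)
        if title is None:
            continue
        page = PAGE_RE.search(chunk)
        count = COUNT_RE.search(chunk)
        depth = len(remaining)
        result.append((depth, title.group(1), int(page.group(1)) if page else None))
        if remaining:
            remaining[-1] -= 1  # this entry uses up one slot of its parent
        # A negative count means "closed by default"; the magnitude is what matters.
        nchildren = abs(int(count.group(1))) if count else 0
        if nchildren:
            remaining.append(nchildren)
        # Close every level whose children have all been seen.
        while remaining and remaining[-1] == 0:
            remaining.pop()
    return result


def format_toc(entries, indent="    "):
    """Render the parsed entries as an indented, human-readable listing."""
    return "\n".join(f"{indent * depth}{title} (p. {page})"
                     for depth, title, page in entries)


if __name__ == "__main__":
    print(format_toc(parse_toc(SAMPLE)))
```

The stack of outstanding child counts is what gets the hierarchy right: the
depth of an entry is simply the number of parents still waiting for children.
If the real files encode nesting some other way, that part needs replacing.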

Cheers,
Cameron Simpson <cs at cskk.id.au>

