Accessing file metadata on Windows XP

Tim Golden Tim.Golden at viacom-outdoor.co.uk
Tue Nov 28 05:07:09 EST 2006


[wileyregister22 at gmail.com]

| When rightclicking a, for example, pdf file on windows, one normally
| gets a screen with three or four tags.  Clicking on one of the summary
| tag one can get some info like "title", "Author", "category",
| "keyword" etc..

[warning: not my area of expertise]

That information's held in NTFS Alternate Data Streams.
If you search around with terms like

NTFS (ADS OR "Alternate Data Streams") 

you'll see a whole raft of info on the subject.

In MS Office (and other OLE documents) the information 
is exposed as what's called Structured Storage. I've got 
a bit of a wrapper round it in my winshell module, which 
you could either use direct or simply take as the starting
point for what you're after:

http://timgolden.me.uk/python/winshell.html

or else just search for "OLE Structured Storage".

Python can read ADS normally; simply specify the alternate
data stream colon syntax when you open a file, and open it
in binary mode, since the stream holds binary data:

info = open ("temp.pdf:\x05SummaryInformation", "rb").read ()

but you have to know what to do with it when you get it.
Sorry, don't have time to play with it right now; hopefully 
someone more knowledgeable can chip in.
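
As a starting point for the "what to do with it" part, here is a rough sketch rather than a tested solution. It assumes the stream contains a serialized OLE property set, whose fixed header starts with the byte-order mark 0xFFFE; I haven't checked that against a real pdf. The helper names (stream_path, read_stream, parse_header) are made up for illustration and aren't part of winshell or any other library.

```python
import struct

# Stream name used in the example above; the leading \x05 is part of the name.
SUMMARY_STREAM = "\x05SummaryInformation"

# Assumed fixed header of a serialized property set (little-endian):
# byte order (2), version (2), system identifier (4), CLSID (16),
# number of property sets (4).
_HEADER = struct.Struct("<HHI16sI")


def stream_path(filename, stream_name=SUMMARY_STREAM):
    """Build the filename:streamname path used to open an alternate data stream."""
    return filename + ":" + stream_name


def read_stream(filename, stream_name=SUMMARY_STREAM):
    """Return the raw bytes of the named stream, or None if it can't be opened."""
    try:
        with open(stream_path(filename, stream_name), "rb") as f:
            return f.read()
    except (IOError, OSError):
        return None


def parse_header(data):
    """Return basic header fields, or None if data doesn't look like a property set."""
    if data is None or len(data) < _HEADER.size:
        return None
    byte_order, version, system_id, clsid, count = _HEADER.unpack_from(data)
    if byte_order != 0xFFFE:
        return None
    return {"version": version, "system_id": system_id, "property_sets": count}
```

Something like parse_header (read_stream ("temp.pdf")) would then give you either None or a small dict. Decoding the individual properties (title, author and so on) is a further step which this sketch doesn't attempt.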

TJG
