Accessing file metadata on Windows XP
Tim Golden
Tim.Golden at viacom-outdoor.co.uk
Tue Nov 28 05:07:09 EST 2006
[wileyregister22 at gmail.com]
| When right-clicking, for example, a pdf file on Windows, one normally
| gets a screen with three or four tabs. Clicking on the Summary
| tab, one can get some info like "title", "Author", "category",
| "keyword", etc.
[warning: not my area of expertise]
That information's held in NTFS Alternate Data Streams.
If you search around with terms like
NTFS (ADS OR "Alternate Data Streams")
you'll see a whole raft of info on the subject.
In MS Office (and other OLE documents) the information
is exposed as what's called Structured Storage. I've got
a bit of a wrapper round it in my winshell module, which
you could either use direct or simply take as the starting
point for what you're after:
http://timgolden.me.uk/python/winshell.html
or else just search for OLE Structured Storage
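To decide which route applies to a given file, one option is to check
whether it is an OLE compound (Structured Storage) document at all.
This is only a rough sketch: the helper name is_ole_compound_file is
made up for illustration, and it assumes the usual 8-byte signature
that compound files start with.

```python
# Signature found at the start of an OLE compound (Structured Storage) file.
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def is_ole_compound_file(filename):
    """Return True if the file starts with the OLE compound file signature.

    Hypothetical helper, not part of winshell or pywin32.
    """
    with open(filename, "rb") as f:
        return f.read(len(OLE_MAGIC)) == OLE_MAGIC
```

A .doc or .xls saved by Office on XP should come back True; a .pdf
should come back False, in which case any summary info would be in an
alternate data stream instead.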
Python can read ADS normally; simply specify the alternate
data stream colon syntax when you open a file:
info = open ("temp.pdf:\x05SummaryInformation", "rb").read ()
but you have to know what to do with it when you get it.
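A slightly more defensive version of the same idea, as a sketch only.
The names stream_path and read_stream are invented here, and the code
assumes the file lives on an NTFS volume; on anything else (or if the
stream simply isn't there) the open fails and the function returns
None.

```python
# Name of the stream Explorer uses for the Summary tab; note the
# leading \x05 character, which is part of the stream name.
SUMMARY_STREAM = "\x05SummaryInformation"


def stream_path(filename, stream=SUMMARY_STREAM):
    """Build the filename:streamname form used to address an ADS."""
    return "%s:%s" % (filename, stream)


def read_stream(filename, stream=SUMMARY_STREAM):
    """Return the raw bytes of the named stream, or None if it can't be opened."""
    try:
        # Binary mode: the stream holds a binary property set, not text.
        with open(stream_path(filename, stream), "rb") as f:
            return f.read()
    except (IOError, OSError):
        return None


if __name__ == "__main__":
    data = read_stream("temp.pdf")
    if data is None:
        print("No summary stream found")
    else:
        print("Read %d bytes of summary data" % len(data))
```

What comes back is still the undecoded property set, so this only
gets you as far as the raw bytes.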
Sorry, don't have time to play with it right now; hopefully
someone more knowledgeable can chip in.
TJG