[python-win32] Re: NTFS/MS Word file metadata - no PIDSI for Category?
Earle_Williams at ak.blm.gov
Fri Jul 14 23:55:34 CEST 2006
Posting this for posterity: a snippet to read and return the file
properties summary information from a modern Windows file system. It works
for me on Win XP Pro SP2 over NTFS and Active Directory. Thanks to Roger
Upole and Mark Hammond for pointing the way, and my apologies for any
Python newb hacks.
<pre>
from win32com import storagecon
import pythoncom, os, sys

def get_stats(fname):
    author = title = subject = keywords = comments = category = None
    mode = storagecon.STGM_READ|storagecon.STGM_SHARE_EXCLUSIVE
    try:
        # Plain file on NTFS: ask for the property set storage directly.
        pssread = pythoncom.StgOpenStorageEx(fname, mode,
            storagecon.STGFMT_FILE, 0, pythoncom.IID_IPropertySetStorage)
    except:
        # Fall back to opening the file as a structured storage (e.g. a .doc).
        stg = pythoncom.StgOpenStorage(fname, None, mode)
        try:
            pssread = stg.QueryInterface(pythoncom.IID_IPropertySetStorage)
        except:
            print("No extended storage")
        else:
            try:
                ps = pssread.Open(pythoncom.FMTID_SummaryInformation, mode)
            except:
                pass
            else:
                author, title, subject, keywords, comments = ps.ReadMultiple(
                    (storagecon.PIDSI_AUTHOR, storagecon.PIDSI_TITLE,
                     storagecon.PIDSI_SUBJECT, storagecon.PIDSI_KEYWORDS,
                     storagecon.PIDSI_COMMENTS))
            try:
                ps = pssread.Open(pythoncom.FMTID_DocSummaryInformation, mode)
            except:
                pass
            else:
                # Category lives in DocSummaryInformation, hence PIDDSI_*.
                category = ps.ReadMultiple((storagecon.PIDDSI_CATEGORY,))[0]
        return author, title, subject, keywords, comments, category
    else:
        try:
            ps = pssread.Open(pythoncom.FMTID_SummaryInformation, mode)
        except:
            pass
        else:
            author, title, subject, keywords, comments = ps.ReadMultiple(
                (storagecon.PIDSI_AUTHOR, storagecon.PIDSI_TITLE,
                 storagecon.PIDSI_SUBJECT, storagecon.PIDSI_KEYWORDS,
                 storagecon.PIDSI_COMMENTS))
        try:
            ps = pssread.Open(pythoncom.FMTID_DocSummaryInformation, mode)
        except:
            pass
        else:
            category = ps.ReadMultiple((storagecon.PIDDSI_CATEGORY,))[0]
        try:
            ps = pssread.Open(pythoncom.FMTID_UserDefinedProperties, mode)
        except:
            pass
        else:
            # Placeholder: user-defined properties are opened but not read.
            pass
    return author, title, subject, keywords, comments, category

if __name__ == '__main__':
    if len(sys.argv) < 2:
        print("Usage: getstats filename")
    else:
        filename = sys.argv[1]
        print(filename)
        author, title, subject, keywords, comments, category = get_stats(
            filename)
        print(" Author: %s" % author)
        print(" Title: %s" % title)
        print(" Subject: %s" % subject)
        print(" Keywords: %s" % keywords)
        print(" Comments: %s" % comments)
        print(" Category: %s" % category)
</pre>
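The snippet above repeats the same "open a property set, read a few IDs, ignore a failure" steps in both branches. A minimal sketch of one way to factor that out follows. The helper name read_props and the two Fake* classes are hypothetical and not part of pywin32; the fakes only imitate the Open/ReadMultiple calls used above so that the helper can be tried without Windows.

```python
# Hypothetical helper (not from the original post or from pywin32): open one
# property set and read the requested property IDs, returning None for every
# ID if the set cannot be opened.
def read_props(pss, fmtid, pids, mode):
    try:
        ps = pss.Open(fmtid, mode)
    except Exception:
        return (None,) * len(pids)
    return tuple(ps.ReadMultiple(tuple(pids)))


# Stand-ins for the COM objects, for illustration only.
class FakePropertySet(object):
    def __init__(self, values):
        self._values = values

    def ReadMultiple(self, pids):
        return tuple(self._values.get(pid) for pid in pids)


class FakePropertySetStorage(object):
    def __init__(self, sets):
        self._sets = sets

    def Open(self, fmtid, mode):
        if fmtid not in self._sets:
            raise IOError("no such property set")
        return FakePropertySet(self._sets[fmtid])


# The keys "summary"/"docsummary" and the IDs 4 and 2 are placeholders for
# the real FMTID_* and PIDSI_*/PIDDSI_* constants.
fake = FakePropertySetStorage({"summary": {4: "Earle", 2: "Report"}})
author, title = read_props(fake, "summary", (4, 2), mode=0)
(category,) = read_props(fake, "docsummary", (2,), mode=0)
```

With the real objects, the call would presumably look like read_props(pssread, pythoncom.FMTID_DocSummaryInformation, (storagecon.PIDDSI_CATEGORY,), storagecon.STGM_READ|storagecon.STGM_SHARE_EXCLUSIVE). Catching Exception here is narrower than the bare except in the original, so treat that as a design choice to revisit.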
python-win32-bounces at python.org wrote on 07/13/2006 06:24:38 PM:
> Earle Williams wrote:
> > Hola,
> >
> > I'm trying to pull extended file properties from NTFS or MSWord files.
> > List archives point to snippets from Mark Hammond and Roger Upole, and I
> > can get to most of the metadata. However I'm having trouble getting to the
> > 'Category' information. It seems in the NTFS metadata that item is flagged
> > with a PIDSI_TITLE constant, at least that's what I get with my code
> > (hacked from testStorage.py). If there is no 'Title' info and just
> > Category info, the category info gets read as title.
> >
> > And in MSWord metadata I can't pull that info at all using Mark Hammond's
> > DumpStorage snippet. I get everything else but not the 'Category' data.
> >
> > Anyone have advice on a method to definitively retrieve the category info?
> >
>
> Category is part of DocSummaryInformation, so you'll need the PIDDSI*
> constants instead of PIDSI*. (PIDDSI_CATEGORY just happens to be
> equal to PIDSI_TITLE)
>
> from win32com import storagecon
> import pythoncom
> fname='c:\\tmp.doc'
>
> pss=pythoncom.StgOpenStorageEx(fname,
>     storagecon.STGM_READ|storagecon.STGM_SHARE_EXCLUSIVE,
>     storagecon.STGFMT_DOCFILE, 0 , pythoncom.IID_IPropertySetStorage)
> ps=pss.Open(pythoncom.FMTID_DocSummaryInformation)
> print ps.ReadMultiple((storagecon.PIDDSI_CATEGORY,))[0]
>
> Roger
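
Roger's point is that a property ID only means something inside a particular property set. The sketch below illustrates that by keying a lookup on the (FMTID, PID) pair. The numeric and GUID values are written out from memory of the Windows headers and should be treated as assumptions; real code would take them from storagecon and pythoncom. The property_name function is hypothetical.

```python
# Values assumed from the Windows property-set definitions; in real code use
# pythoncom.FMTID_* and storagecon.PIDSI_* / PIDDSI_* instead.
FMTID_SummaryInformation = "{F29F85E0-4FF9-1068-AB91-08002B27B3D9}"
FMTID_DocSummaryInformation = "{D5CDD502-2E9C-101B-9397-08002B2CF9AE}"
PIDSI_TITLE = 2
PIDDSI_CATEGORY = 2

# The same number names different properties depending on the property set.
PROPERTY_NAMES = {
    (FMTID_SummaryInformation, PIDSI_TITLE): "Title",
    (FMTID_DocSummaryInformation, PIDDSI_CATEGORY): "Category",
}


def property_name(fmtid, pid):
    """Hypothetical lookup: resolve a PID within its property set."""
    return PROPERTY_NAMES.get((fmtid, pid), "unknown")


in_summary = property_name(FMTID_SummaryInformation, 2)
in_doc_summary = property_name(FMTID_DocSummaryInformation, 2)
```

So reading ID 2 from SummaryInformation gives the title, and the category only appears once DocSummaryInformation is opened.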