I'm working on a small script that uses the xlrd module (<a href="http://www.lexicon.net/sjmachin/xlrd.html">http://www.lexicon.net/sjmachin/xlrd.html</a>) to search all the files in a given directory for an .xls file containing a specific worksheet. This way, if the file names change, or people don't save the spreadsheet under the right name, my script can still locate the correct file to use as its data source among multiple files and versions. What I have so far looks roughly like this:
<br><br>import os<br>import xlrd<br><br>data = {}<br><br># path may be set externally at some point<br>data['path'] = 'mypath_to_program'<br><br>os.chdir(data['path'])<br><br>data['xls_files'] = [ f for f in os.listdir('./') if '.xls' in f ]<br><br>first_files = [ f for f in data['xls_files'] if u'First Worksheet' in xlrd.open_workbook(f).sheet_names() ]<br>data['first_file'] = ??
<br><br>second_files = [ f for f in data['xls_files'] if u'Second Worksheet' in xlrd.open_workbook(f).sheet_names() ]<br>data['second_file'] = ??<br><br>This is where I get stuck. From the files that match, I want to select the one with the most recent timestamp and use it as my main data file.<br>I know I can get the modification time with os.stat(file).st_mtime, but I'm not sure how to sort my results by it to get just the most recent version. Any help or thoughts would be appreciated.
<br><br>I'm also going to be looking for other worksheets that might be in other .xls files, for example 'Second Worksheet', but I was just trying to get 'first_files' working first. Instead of opening each file every time, should I build some kind of data structure that stores each file found along with its worksheets and its modification time, and then just search that? For example, should I change xls_files so that it's not just a list of names?
<br>
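To make the idea in the last paragraph concrete, here is a rough sketch of one way it might work: open every workbook once, record its modification time and worksheet names, then pick the newest match for each worksheet with max(). The names newest_file_per_sheet, xlrd_sheet_names, and the sheet_names_of parameter are made up for illustration and are not part of xlrd. The extension check is stricter than the `'.xls' in file` test above, which is also an assumption about what should count as a match.

```python
import os


def xlrd_sheet_names(path):
    """Return the worksheet names of one workbook using xlrd."""
    import xlrd  # imported lazily so the rest of the sketch runs without it
    return xlrd.open_workbook(path).sheet_names()


def newest_file_per_sheet(directory, wanted_sheets, sheet_names_of=xlrd_sheet_names):
    """Map each wanted worksheet name to the most recently modified
    .xls file in `directory` that contains it, or None if no file does.

    `sheet_names_of` is a hypothetical hook: any callable that takes a
    path and returns that workbook's worksheet names.
    """
    # Open every workbook exactly once; remember mtime, path and sheet names.
    index = []
    for name in os.listdir(directory):
        if not name.lower().endswith('.xls'):
            continue
        path = os.path.join(directory, name)
        index.append((os.path.getmtime(path), path, set(sheet_names_of(path))))

    newest = {}
    for sheet in wanted_sheets:
        matches = [(mtime, path) for mtime, path, sheets in index if sheet in sheets]
        # Tuples compare by their first element, so max() picks the largest mtime.
        newest[sheet] = max(matches)[1] if matches else None
    return newest
```

With something like this, data['first_file'] could be filled from newest_file_per_sheet(data['path'], [u'First Worksheet', u'Second Worksheet'])[u'First Worksheet'], and each workbook is opened only once no matter how many worksheets are looked up.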