newbie - merging xls files using xldt and xlwt
sjmachin at lexicon.net
Wed Oct 15 15:14:19 CEST 2008
On Oct 15, 9:16 pm, Albert-jan Roskam <fo... at yahoo.com> wrote:
> I wrote the program below to merge all xls files in a given directory into one multisheet xls file. It uses xlwt and xlrd. The xls files I use for input are generated by Spss. When I open and re-save the files in Excel, the program works, but when I use the xls files as they were created by Spss, the program returns an error message (see below). Some details: Python 2.4, xlwt version 0.7.0, xlrd version 0.5.2, Win NT.
> Any ideas?
The version of xlrd that you are using is an antique. Go to
http://pypi.python.org/pypi/xlrd and get the latest version. If that
doesn't work, send me a copy of one of the files that is displaying
> Thanks in advance!
> Merge all xls files in a given directory into one multisheet xls file.
> The sheets get the orginal file name, without the extension.
> File names should not exceed 31 characters, as this is the maximum
> sheet name length
> import xlrd, xlwt
> import glob, os.path
> def merge_xls (in_dir, out_file="d:/merged_output.xls"):
> xls_files = glob.glob(in_dir + "*.xls")
> sheet_names = 
> merged_book = xlwt.Workbook()
> [sheet_names.append(os.path.basename(v)[:-4]) for k, v in enumerate(xls_files)]
Wah! Try this:
sheet_names = [os.path.basename(v)[:-4]) for v in xls_files]
> for k, xls_file in enumerate(xls_files):
> if len (sheet_names[k]) <= 31:
> book = xlrd.open_workbook(xls_file)
> ws = merged_book.add_sheet(sheet_names[k])
> for sheetx in range(book.nsheets):
> sheet = book.sheet_by_index(sheetx)
> for rx in range(sheet.nrows):
> for cx in range(sheet.ncols):
> ws.write(rx, cx, sheet.cell_value(rx, cx))
I presume that you're not too worried about any date data.
If an input file has more than 1 sheet, you are creating only one
sheet in the output file, and overwriting cells.
> *** WARNING: OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero ***
> put_cell 0 1
> Traceback (most recent call last):
> self._cell_types[rowx][colx] = ctype
> IndexError: list assignment index out of range
Yeah, symptom of an xls file with a DIMENSIONS records that lies ...
xlrd has like Excel become more resilient over time :-)
BTW, consider joining the python-excel group at http://groups.google.com/group/python-excel
More information about the Python-list