newbie - merging xls files using xldt and xlwt

John Machin sjmachin at lexicon.net
Wed Oct 15 15:14:19 CEST 2008


On Oct 15, 9:16 pm, Albert-jan Roskam <fo... at yahoo.com> wrote:
> Hi,
>
> I wrote the program below to merge all xls files in a given directory into one multisheet xls file. It uses xlwt and xlrd. The xls files I use for input are generated by Spss. When I open and re-save the files in Excel, the program works, but when I use the xls files as they were created by Spss, the program returns an error message (see below). Some details: Python 2.4, xlwt version 0.7.0, xlrd version 0.5.2, Win NT.
>
> Any ideas?

The version of xlrd that you are using is an antique. Go to
http://pypi.python.org/pypi/xlrd and get the latest version. If that
doesn't work, send me a copy of one of the files that is displaying
this problem.
>
> Thanks in advance!
>
> Albert-Jan
>
> """
> Merge all xls files in a given directory into one multisheet xls file.
> The sheets get the orginal file name, without the extension.
> File names should not exceed 31 characters, as this is the maximum
> sheet name length
> """
>
> import xlrd, xlwt
> import glob, os.path
>
> def merge_xls (in_dir, out_file="d:/merged_output.xls"):
>
>     xls_files   = glob.glob(in_dir + "*.xls")
>     sheet_names = []
>     merged_book = xlwt.Workbook()
>
>     [sheet_names.append(os.path.basename(v)[:-4]) for k, v in enumerate(xls_files)]

Wah! Try this:
sheet_names = [os.path.basename(v)[:-4]) for v in xls_files]


>     for k, xls_file in enumerate(xls_files):
>         if len (sheet_names[k]) <= 31:
>             book = xlrd.open_workbook(xls_file)
>             ws = merged_book.add_sheet(sheet_names[k])
>             for sheetx in range(book.nsheets):
>                 sheet = book.sheet_by_index(sheetx)
>                 for rx in range(sheet.nrows):
>                     for cx in range(sheet.ncols):
>                          ws.write(rx, cx, sheet.cell_value(rx, cx))

I presume that you're not too worried about any date data.

If an input file has more than 1 sheet, you are creating only one
sheet in the output file, and overwriting cells.

>         else:
>[snip]
>
> merge_xls(in_dir="d:/temp/")
>
> *** WARNING: OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero ***
> put_cell 0 1
>
> Traceback (most recent call last):
[snip]
>     self._cell_types[rowx][colx] = ctype
> IndexError: list assignment index out of range

Yeah, symptom of an xls file with a DIMENSIONS records that lies ...
xlrd has like Excel become more resilient over time :-)

BTW, consider joining the python-excel group at http://groups.google.com/group/python-excel

Cheers,
John



More information about the Python-list mailing list