XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\r\n\r\n\r\n\r\n'
J.O. Aho
user at example.net
Wed Sep 29 08:11:49 EDT 2021
On 29/09/2021 13.10, hongy... at gmail.com wrote:
> On Wednesday, September 29, 2021 at 5:40:58 PM UTC+8, J.O. Aho wrote:
>> On 29/09/2021 10.22, hongy... at gmail.com wrote:
>>> I tried to convert an xls file into csv with the following command, but it failed:
>>>
>>> $ in2csv --sheet 'Sheet1' 2021-2022-1.xls
>>> XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\r\n\r\n\r\n\r\n'
>>>
>>> The test file above is located here [1].
>>>
>>> [1] https://github.com/hongyi-zhao/temp/blob/master/2021-2022-1.xls
>>>
>>> Any hints for fixing this problem?
>> You need to delete the 13 first lines in the file
>
> Yes. After deleting the top 3 lines, the problem has been fixed.
>
>> or see to it that your code first trims the data before it starts parsing the XML.
>
> Yes. I really want to do this trick programmatically, but how do I do it without manually editing the file?
You could load the XML into a string (myxmlstr) and then find the
first '<' in that string:

    xmlstart = myxmlstr.find('<')
    xmlstr = myxmlstr[xmlstart:]

Then pass xmlstr to the XML parser. This is not as convenient as loading
the file directly into the XML parser, and note that find() returns -1
when there is no '<' at all, so check for that case before slicing.
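
A minimal sketch of that idea, assuming the file really is well-formed
XML once the leading blank lines are gone. The helper names
strip_leading_junk and parse_trimmed_xml are made up for illustration,
and reading bytes rather than a str is just one possible choice, so that
the parser can honour any encoding declaration itself.

```python
import xml.etree.ElementTree as ET


def strip_leading_junk(data: bytes) -> bytes:
    """Return data starting at the first '<', dropping anything before it."""
    start = data.find(b"<")
    if start == -1:
        # bytes.find returns -1 when nothing is found; slicing with -1
        # would silently keep only the last byte, so fail loudly instead.
        raise ValueError("no '<' found; input does not look like XML")
    return data[start:]


def parse_trimmed_xml(path):
    """Read the file at path, trim the leading junk, return the root element."""
    with open(path, "rb") as f:
        raw = f.read()
    return ET.fromstring(strip_leading_junk(raw))
```

Calling parse_trimmed_xml('2021-2022-1.xls') would then give you an
Element to work with, provided the rest of the file parses as XML.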
I'm not saying this is the best way of doing it; I'm sure some Python wiz
here has a smarter solution.
--
//Aho