etree, gzip, and BytesIO
Frank Millman
frank at chagford.com
Thu Jan 21 01:22:08 EST 2021
Hi all
This question is mostly to satisfy my curiosity.
In my app I use XML to represent certain objects, such as form
definitions and process definitions.
They are stored in a database. I use etree.tostring() when storing them
and etree.fromstring() when reading them back. They can be quite large,
so I use gzip to compress them before storing them as a blob.
The sequence of events when reading them back is -
- select gzip'd data from database
- run gzip.decompress() to get the uncompressed bytes
- run etree.fromstring() to convert to an etree object
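
A minimal sketch of the store/read sequence above. It assumes "etree" means the standard library's xml.etree.ElementTree (lxml.etree offers similarly named calls). The store() and load() helpers and the sample form element are hypothetical, for illustration only.

```python
import gzip
import xml.etree.ElementTree as etree  # assumption: stdlib ElementTree


def store(root):
    """Serialise an element to bytes and compress it for storage as a blob."""
    return gzip.compress(etree.tostring(root))


def load(blob):
    """Decompress the whole blob, then parse the resulting bytes."""
    xml_bytes = gzip.decompress(blob)  # the full uncompressed document is in memory here
    return etree.fromstring(xml_bytes)


# Hypothetical form definition standing in for the real data
form = etree.Element("form", name="customer")
etree.SubElement(form, "field", name="surname")

blob = store(form)      # what would be written to the database
restored = load(blob)   # what would be read back
```

Note that etree.tostring() returns bytes by default, so the value passed to gzip.compress() and the value returned by gzip.decompress() are both bytes rather than str.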
I was wondering if I could avoid having the unzipped string in memory,
and create the etree object directly from the gzip'd data. I came up
with this -
- select gzip'd data from database
- create a BytesIO object - fd = io.BytesIO(data)
- use gzip to open the object - gf = gzip.open(fd)
- run etree.parse(gf) to convert to an etree object
It works.
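
The alternative sequence can be sketched as follows, again assuming the standard library's xml.etree.ElementTree. The load_streaming() name and the sample blob are hypothetical. One difference from the fromstring() version: etree.parse() returns an ElementTree, so getroot() is needed to get the same Element that fromstring() would return.

```python
import gzip
import io
import xml.etree.ElementTree as etree  # assumption: stdlib ElementTree


def load_streaming(blob):
    """Parse XML from gzip'd bytes via a file-like object."""
    fd = io.BytesIO(blob)         # wraps the compressed bytes
    with gzip.open(fd) as gf:     # binary read mode by default; decompresses on read()
        tree = etree.parse(gf)    # the parser pulls its input from gf
    return tree.getroot()


# Hypothetical blob standing in for the data selected from the database
blob = gzip.compress(b"<form name='customer'><field name='surname'/></form>")
root = load_streaming(blob)
```

Whether this avoids holding the whole uncompressed document at once depends on how the parser reads from the file object, which is an implementation detail of the etree library in use.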
But I don't know what goes on under the hood, so I don't know if this
achieves anything. If any of the steps involves decompressing the data
and storing the entire string in memory, I may as well stick to my
present approach.
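
One way to check this empirically, rather than by reading the library source, is to compare peak allocations with tracemalloc. This is only a rough sketch: it assumes the standard library's xml.etree.ElementTree, the helper names and the test document are hypothetical, and tracemalloc only sees memory allocated through Python's own allocators, so the numbers are indicative at best.

```python
import gzip
import io
import tracemalloc
import xml.etree.ElementTree as etree  # assumption: stdlib ElementTree


def peak_bytes(func, blob):
    """Return the peak traced allocation, in bytes, while func(blob) runs."""
    tracemalloc.start()
    try:
        func(blob)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def load_whole(blob):
    # Present approach: decompress everything, then parse
    return etree.fromstring(gzip.decompress(blob))


def load_streaming(blob):
    # Alternative approach: let the parser read from the gzip file object
    with gzip.open(io.BytesIO(blob)) as gf:
        return etree.parse(gf).getroot()


# Hypothetical test document; real form definitions will differ
xml_bytes = b"<form>" + b"<field name='x'/>" * 20_000 + b"</form>"
blob = gzip.compress(xml_bytes)

peak_whole = peak_bytes(load_whole, blob)
peak_streaming = peak_bytes(load_streaming, blob)
print("decompress then fromstring:", peak_whole)
print("gzip.open then parse:      ", peak_streaming)
```

The parsed tree itself usually takes far more memory than the XML text, so any difference between the two figures should be read against that baseline.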
Any thoughts?
Frank Millman