Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API
Thomas Passin
list1 at tompassin.net
Mon Sep 30 13:57:05 EDT 2024
On 9/30/2024 1:00 PM, Chris Angelico via Python-list wrote:
> On Tue, 1 Oct 2024 at 02:20, Thomas Passin via Python-list
> <python-list at python.org> wrote:
>>
>> On 9/30/2024 11:30 AM, Barry via Python-list wrote:
>>>
>>>
>>>> On 30 Sep 2024, at 06:52, Abdur-Rahmaan Janhangeer via Python-list <python-list at python.org> wrote:
>>>>
>>>>
>>>> import polars as pl
>>>> pl.read_json("file.json")
>>>>
>>>>
>>>
>>> This is not going to work unless the computer has a lot more than 60GiB of RAM.
>>>
>>> As later suggested a streaming parser is required.
>>
>> Streaming won't work because the file is gzipped. You have to receive
>> the whole thing before you can unzip it. Once unzipped it will be even
>> larger, and all in memory.
>
> Streaming gzip is perfectly possible. You may be thinking of PKZip
> which has its EOCD at the end of the file (although it may still be
> possible to stream-decompress if you work at it).
>
> ChrisA
You're right, that's what I was thinking of.
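
[For anyone following along: incremental gzip decompression really does work with
nothing but the standard library, since gzip is a pure stream format with no
trailing directory. Below is a minimal sketch using zlib.decompressobj with
wbits=31 (the gzip wrapper); the sample payload and chunk size are made up for
illustration. In real use you would feed each decompressed chunk to an
incremental JSON parser rather than accumulating it.]

```python
import io
import gzip
import zlib

# Made-up stand-in for the Kenna API response body: a gzipped JSON document.
data = b'{"vulns": [' + b'{},' * 1000 + b'{}]}'
compressed = gzip.compress(data)

# wbits=31 tells zlib to expect a gzip header and trailer (16 + MAX_WBITS).
decompressor = zlib.decompressobj(wbits=31)

stream = io.BytesIO(compressed)  # in real code: the HTTP response stream
chunks = []
while True:
    block = stream.read(4096)  # decompress 4 KiB of compressed input at a time
    if not block:
        break
    chunks.append(decompressor.decompress(block))
chunks.append(decompressor.flush())

result = b''.join(chunks)
```

At no point does the whole compressed file (or the whole decompressed output,
if the chunks are consumed instead of joined) need to sit in memory at once.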