I want to compute several ndarrays at once but don't have enough memory to hold them, so I want to write the results to disk in chunks (here sized to the GPU batch capacity). It seems there should be an interface to write the header, then write a number of elements in a loop, then add any closing footer and close the file.
Is it as simple as calling numpy.lib.format.write_array_header_2_0(fp, d)
and then writing multiple float arrays of shape (N,) with fp.write(item.tobytes())?
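
A minimal sketch of the approach described above. It assumes the total number of elements is known before the first chunk is written, since the .npy header stores the final shape and the format has no closing section after the data. The names write_npy_in_chunks, chunks and total_len are hypothetical, chosen only for illustration; d is the header dict with the keys descr, fortran_order and shape.

```python
import os
import tempfile

import numpy as np
from numpy.lib import format as npy_format


def write_npy_in_chunks(path, chunks, total_len, dtype=np.float64):
    """Write 1-D chunks to a single .npy file without holding them all in memory.

    Hypothetical helper: `total_len` must equal the summed size of all chunks,
    because the shape is fixed in the header before any data is written.
    """
    dtype = np.dtype(dtype)
    d = {
        "descr": npy_format.dtype_to_descr(dtype),
        "fortran_order": False,
        "shape": (total_len,),
    }
    written = 0
    with open(path, "wb") as fp:
        npy_format.write_array_header_2_0(fp, d)
        for chunk in chunks:
            # Force the dtype and C-contiguous layout promised in the header.
            chunk = np.ascontiguousarray(chunk, dtype=dtype)
            fp.write(chunk.tobytes())
            written += chunk.size
    if written != total_len:
        raise ValueError(f"header promised {total_len} elements, wrote {written}")


# Usage: three chunks of 4 floats each, standing in for GPU batches.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.npy")
batches = (np.arange(i, i + 4, dtype=np.float64) for i in range(0, 12, 4))
write_npy_in_chunks(demo_path, batches, total_len=12)
result = np.load(demo_path)
```

If the total length is not known in advance, this sketch does not apply as written: the header would have to be rewritten once the final shape is known. When the shape is known, numpy.lib.format.open_memmap with mode="w+" is another option that lets each chunk be assigned into a slice of the file.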