How to reduce memory requirement
Dear all,

I am trying to build a bulk transport system of 40 * 40 * 40 sites with periodic boundary conditions. It requires over 2 GB of memory during the transmission calculation. I am curious why the memory requirement is so large. Since I am going to expand the system by up to 50 times, I would like to know whether there is any way to reduce the memory requirement and speed up the calculation. I am using Kwant 1.1.2 and MUMPS with the default PORD ordering. The system-building code is:

    # Scattering region: periodic along y and z
    syst = kwant.Builder(kwant.TranslationalSymmetry(
        lat.vec((0, cube.NY, 0)), lat.vec((0, 0, cube.NZ))))
    for i in range(cube.NX):
        for j in range(cube.NY):
            for k in range(cube.NZ):
                syst[lat(i, j, k)] = -cube.data[i, j, k] + 6 * t
    syst[lat.neighbors(1)] = -t

    # Lead: translationally invariant along x, periodic along y and z
    lead = kwant.Builder(kwant.TranslationalSymmetry(
        lat.vec((-1, 0, 0)), lat.vec((0, cube.NY, 0)), lat.vec((0, 0, cube.NZ))))
    lead[lat.shape(lead_shape, (0, 0, 0))] = 6 * t
    lead[lat.neighbors(1)] = -t

    # Wrap the periodic directions around and attach a lead on each side
    syst = wraparound(syst)
    syst.attach_lead(wraparound(lead, keep=0))
    syst.attach_lead(wraparound(lead, keep=0).reversed())

-- Mingkai Li
Dear Mingkai Li,

First, 3D systems take a lot of memory, and you will most likely not be able to increase the system size by 50x without switching to out-of-core or distributed computations. You could reduce the memory usage somewhat by trying the METIS or SCOTCH orderings; in 2D they greatly outperform PORD. However, I don't think this will bring you close to what you plan.

The part that takes memory is storing the factors of the LU decomposition. With nested dissection (the asymptotically optimal algorithm), the memory requirement scales as ~L^4 log L, with L the linear size of the system, so the enlarged system would require petabytes of storage. Even after switching to the most memory-efficient ordering and discarding unused LU factors (so effectively doing RGF, the recursive Green's function method), you still need to store a dense matrix whose size equals the cross-section of your system, which is already on the order of a petabyte.

Best,
Anton

On Tue, Mar 14, 2017 at 2:14 AM, Li Mingkai <mingkaili@gmail.com> wrote:
> [...]
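The scaling arguments in Anton's reply can be turned into rough numbers. The sketch below is only a back-of-the-envelope illustration, not part of the original exchange. It assumes that "50 times" refers to the linear size (40 sites becoming 2000 per side), that matrix entries are complex doubles of 16 bytes, that the logarithm is natural, and that the prefactor of the L^4 log L scaling is 1, which real solvers will not match. The function names are hypothetical.

```python
import math

BYTES_PER_COMPLEX = 16  # assumption: complex128 matrix entries


def dense_cross_section_bytes(linear_size):
    """Bytes for one dense matrix whose dimension equals the
    number of sites in an L x L cross-section (the RGF lower bound)."""
    n_sites = linear_size * linear_size
    return n_sites * n_sites * BYTES_PER_COMPLEX


def nested_dissection_bytes(linear_size):
    """Order-of-magnitude bytes for LU factors under the ~L^4 log L
    scaling, with an assumed prefactor of 1."""
    return linear_size ** 4 * math.log(linear_size) * BYTES_PER_COMPLEX


if __name__ == "__main__":
    for size in (40, 2000):
        print(size,
              dense_cross_section_bytes(size),
              nested_dissection_bytes(size))
```

Under these assumptions the dense cross-section matrix alone takes 256 TB for L = 2000, and the nested-dissection estimate lands in the petabyte range, consistent with the orders of magnitude quoted in the reply. For L = 40 the same formulas give far less than the 2 GB observed, which shows that the assumed prefactor is too optimistic and only the scaling should be trusted.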
Thanks, Anton. I will try the METIS or SCOTCH ordering.

Best,
Mingkai Li

On Tue, Mar 14, 2017 at 12:51 PM, Anton Akhmerov <anton.akhmerov+kd@gmail.com> wrote:
> [...]
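For readers who want to try a different ordering, a minimal sketch follows. It assumes a Kwant installation with the MUMPS solver available, and that MUMPS was built with METIS or SCOTCH support, which is not guaranteed. The helper name `smatrix_with_ordering` and the `ALLOWED_ORDERINGS` tuple are hypothetical; `fsyst` stands for the finalized system from the original message. The ordering is set through `kwant.solvers.mumps.options`, and the solver is imported inside the function so that the snippet can be loaded even where MUMPS is missing.

```python
# Orderings suggested in the thread plus the one used originally.
ALLOWED_ORDERINGS = ("metis", "scotch", "pord")


def smatrix_with_ordering(fsyst, energy, ordering="metis"):
    """Compute the scattering matrix with a chosen MUMPS ordering.

    fsyst    -- a finalized Kwant system (e.g. syst.finalized())
    energy   -- energy at which to solve
    ordering -- one of ALLOWED_ORDERINGS
    """
    if ordering not in ALLOWED_ORDERINGS:
        raise ValueError("unsupported ordering: %r" % (ordering,))
    # Imported here because this module is only importable when
    # Kwant was built with MUMPS support.
    import kwant.solvers.mumps as mumps_solver
    mumps_solver.options(ordering=ordering)
    return mumps_solver.smatrix(fsyst, energy)
```

A call such as `smatrix_with_ordering(syst.finalized(), energy, ordering="scotch").transmission(1, 0)` would then give the transmission between the two leads. As noted in the reply above, a better ordering is expected to help somewhat but not to change the overall scaling.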
participants (2)
- Anton Akhmerov
- Li Mingkai