memory control in Python
Oscar Benjamin
oscar.j.benjamin at gmail.com
Tue Aug 18 13:21:52 CEST 2015
On 15 August 2015 at 00:41, Ping Liu <yanzhipingliu at gmail.com> wrote:
> Dear All,
Hi Ping Liu,
> I am working on an optimization problem, where we are trying to minimize
> some indicators like energy usage, energy cost, CO2 emission. In this
> problem, we have a bunch of energy conversion technologies for electricity
> and thermal purpose, such as heat pump, boiler, chiller, etc.. We are trying
> to model it with a one year time period. the current time step is one hour.
I guess this means that you are running a simulation over a period of
1 year. If the timestep is one hour then the number of timesteps is
24*365 = 8760.
> We have preselected the candidate technologies to exclude those we don't
> want to study so that the problem size could be reduced with a limited
> candidate technologies. In the current case study, we only analyze the
> electric chiller and heat pump to supply the cooling load, while power grid
> will supply the electricity for all electric loads. There are binary
> variables regarding installation decisions of technologies and continuous
> variables related to installation capacity and hourly operational decisions.
How many binary variables do you have? I ask because it may be
preferable to consider each case of the binary variables separately.
Suppose that there are 5 binary variables. Then that makes 2**5 = 32
possible cases. It might be simpler to solve the continuous
optimisation problem for each of the 32 cases separately. On the other
hand if you have 20 binary variables then you'll have 1e6 cases to
consider in which case it is less likely to be feasible to consider
each one.
> For small cases, Python works well. But if we consider longer time period.
> then it would fail due to the memory usage issues. We have tested several
> case studies to check the memory use for different time period, including 1)
> 2 hours in one day, 2) 24 hours in one day, 3) 20 days with 24 hours each
> day, as well as 4) 30 days with 24 hours each day. The first 3 cases are
> feasible while the last case gives out the memory error.
If I understand correctly what you mean is that you are attempting
different length simulations. In the first case you have 2 timesteps
meaning a simulation of 2 hours, then 24, then 480, and then 720. Your
target is to get to 8760 timesteps. Is that correct?
> When we are testing the forth case, the memory error comes out while
> creating the inequality constraints. The problem size is 1) Aeq: 12 * 26,
> Aineq: 30 * 26; 2) Aeq: 144*268, Aineq:316*268; 3) Aeq: 2880*5284, Aineq:
> 6244*5284; 4) Aeq: 4320 * 7924, Aineq is unknown due to the memory error.
I assume that this refers to the constraints that you pass to the
solver which varies depending on the number of timesteps in your
simulation: Aeq refers to the number of equality constraints and Aineq
to the number of inequality constraints?
The number of constraints is (timesteps, equality, inequality):
2 12*26 30*26
24 144*268 316*268
480 2880*5284 6244*5284
720 4320*7924 unknown
So the pattern for N timesteps seems to be:
N (6*N)*(11*N+4) (13*N+4)*(11*N+4)
So I guess the number of inequality constraints in the case of memory
error is 9364*7924.
Essentially what this shows is that your problem is complexity is N**2
where N is the number of timesteps. Even if you can get the memory to
go up to 720 timesteps you still need 10 times more steps to get to
your target of 8760. Since the problem is growing quadratically you'll
need to have 100 times more computer memory again to reach your
target.
Your problem scales very badly and is probably badly posed. Is it
really necessary to impose so many constraints or can the problem be
formulated in a way that uses fewer?
> The solver is CPLEX (academic). It turns out that the solver is taking a lot
> of memory as you can see in the memory test report. for the first three
> cases, different memory usage is observed, and it grows up dramatically with
> the increase of the time period. 1) solver memory usage: 25.6 MB, 2) 19.5
> MB; 3) solver memory usage: 830.0742 MB.
This is unsurprising since the size of your problem grows
quadratically. In the 720 timestep case you are trying to optimise
with 100 million constraints. For your target of 1 year the number of
constraints would be 16 billion. The amount of memory required to
store these constraints is proportional to the number of constraints
but the time taken to optimise with the constraints will probably grow
even faster so even if you had the memory you're unlikely to have the
CPU power to solve this.
> Based on my observations, I have some of the following questions regarding
> 1) In order to create the optimization problem (Aeq, Aineq), how can we
> reduce the memory usage in python programming? 2) how can I reduce the
> memory usage of different solvers in Python? Why would the memory decrease
> for the CPLEX solver within the 24-hours-in-one-day case compared with the
> case 1?
Others have already pointed out that your problem is not really about
Python at all. There's no point in trying to use PyPy, IronPython etc.
It also doesn't matter if you ditch Python and use C or Matlab or some
other language. As long as you're using the same basic algorithm and
approach I don't think you will be able to reach your target of 1
year.
I think that you need to approach this whole problem in a different
way. Either reformulate it so that there are fewer constraints or
choose a different algorithm. Does the CPLEX library only have one
solver routine (you may be able to simply select a different
algorithm)? Perhaps it's just not necessary to use CPLEX and you can
use a different optimisation software or even a completely different
way to pose the problem so that it's not even an optimisation problem
at all.
--
Oscar
More information about the Python-list
mailing list