[CentralOH] Django Management Command Memory Usage

Eric Floehr eric at intellovations.com
Mon Jun 4 17:09:51 CEST 2012


I should say GeoDjango supports a number of GIS implementations (not
PostGIS, which is a specific implementation) :-) ...


On Mon, Jun 4, 2012 at 11:08 AM, Eric Floehr <eric at intellovations.com> wrote:

>
> Running as raw SQL will save memory (and possibly time) as you aren't
> loading each row from database to Python and back.  The drawback would be
> you would be tied to whatever GIS implementation you are using (PostGIS
> supports a number).  Here would be the equivalent PostGIS SQL to do this:
>
> update <Places DB table> set point=ST_SetSRID(ST_Point(x/1000.0,
> y/1000.0), <SRID>) where x is not NULL and y is not NULL;
>
> GeoDjango geometry fields default to an <SRID> of 4326 (WGS 84), so that's
> likely what your point column will expect.
>
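To make the raw SQL route a bit more concrete, here is a minimal sketch of how the statement could be issued from Python. The helper names (build_point_update, update_points_sql) and the table name in the usage note are hypothetical, and the code only assumes a DB-API cursor that uses %s placeholders, which is what Django's connection.cursor() provides. Unlike the statement above, it uses a CASE expression so rows with a missing x or y get point set to NULL, mirroring the else branch of the original loop.

```python
import re

# Hypothetical helper names; nothing here comes from Django or PostGIS itself.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_point_update(table):
    """Return SQL that sets point from x and y, or NULL when either is missing."""
    if not _IDENTIFIER.fullmatch(table):
        # Table names cannot be passed as query parameters, so validate them.
        raise ValueError("unsafe table name: %r" % (table,))
    return (
        "UPDATE {0} SET point = CASE "
        "WHEN x IS NOT NULL AND y IS NOT NULL "
        "THEN ST_SetSRID(ST_Point(x / 1000.0, y / 1000.0), %s) "
        "ELSE NULL END"
    ).format(table)


def update_points_sql(cursor, table, srid=4326):
    """Run the update on any DB-API cursor that uses %s placeholders."""
    # The SRID is passed as a parameter; 4326 is only an assumed default.
    cursor.execute(build_point_update(table), [srid])
```

Inside handle() this might be called as update_points_sql(django.db.connection.cursor(), "myapp_places"), where "myapp_places" stands in for the real table name. Depending on the Django version and transaction settings, the change may also need an explicit commit.
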
> A couple of notes:
>
> ST_Point and most other PostGIS functions expect coordinates in longitude,
> latitude order, so in this case x would be longitude and y latitude.
>
> Also, in your Python implementation, "if record.x and record.y" will be
> false when x or y is 0, which is a valid coordinate, and not only when
> they are None.  This has tripped me up in the past :-).  Better would be
> "if record.x is not None and record.y is not None".
>
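A small illustration of the difference between the two checks, assuming x and y are numeric values or None (the function names and sample values are made up for the example):

```python
def has_coords_truthy(x, y):
    """The check from the original loop: treats 0 the same as None."""
    return bool(x and y)


def has_coords(x, y):
    """Only None counts as missing, so 0 is kept as a valid coordinate."""
    return x is not None and y is not None


# Made-up sample values: two rows with a zero coordinate, one with a missing x.
for x, y in [(0, 41000), (83000, 0), (None, 41000)]:
    print(x, y, has_coords_truthy(x, y), has_coords(x, y))
```

The two functions disagree only on the rows containing a 0: the truthiness check drops them, while the "is not None" check keeps them.
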
> Cheers,
> Eric
>
>
>
>
> On Mon, Jun 4, 2012 at 10:39 AM, Kurtis Mullins <kurtis.mullins at gmail.com> wrote:
>
>> Hey,
>>
>> It looks like you've got the fat trimmed off this one about as much as
>> you can. I'd say filter out your results but even then you're still using
>> every result. I'd recommend modifying this to run as raw SQL and I'm sure
>> your memory usage will go down significantly. Pulling in this many records
>> as Python Objects and conversely creating new Python objects on top of that
>> (your Point objects) is most likely the cause of this memory usage.
>>
>>
>> On Mon, Jun 4, 2012 at 10:28 AM, <jep200404 at columbus.rr.com> wrote:
>>
>>> How can I reduce the memory usage in a Django management command?
>>> I have some Django code like the following in a management command:
>>>
>>> class Command(BaseCommand):
>>> ...
>>>    def handle(self, *args, **options):
>>>        for record in Places.objects.all():
>>>            if record.x and record.y:
>>>                record.point = (
>>>                    Point(float(record.x)/1000.,
>>>                    float(record.y)/1000.))
>>>            else:
>>>                record.point = None
>>>            record.save()
>>>        django.db.connection.close()
>>>
>>> In the settings.py file I have:
>>>
>>> DEBUG = False
>>>
>>> Places has millions of rows.
>>> top reveals that the program is using 18.6 Gigabytes of memory.
>>> How can I reduce that memory usage?
>>> Am I neglecting to close or release something?
>>>
>>> The only docs I'm finding about memory use related to query sets
>>> advise using iterators instead of converting to a list.
>>> I'm already following that advice, but I'm not finding
>>> further guidance about memory use when modifying records.
>>>
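One thing that may be worth checking here: looping directly over a QuerySet fills that QuerySet's internal result cache as it goes, so every record stays referenced until the loop finishes, whereas QuerySet.iterator() skips that cache. Below is a minimal sketch of the loop written that way. update_points and make_point are hypothetical names, and the function only assumes an object with an iterator() method that yields records with x, y, point and save().

```python
def update_points(queryset, make_point):
    """Set point on each record, streaming rows instead of caching them.

    queryset   -- anything with an iterator() method, e.g. a Django QuerySet
    make_point -- callable taking (x, y), e.g. GeoDjango's Point class
    Returns the number of records processed.
    """
    count = 0
    for record in queryset.iterator():
        # Compare against None so that a coordinate of 0 is still converted.
        if record.x is not None and record.y is not None:
            record.point = make_point(
                float(record.x) / 1000.0, float(record.y) / 1000.0
            )
        else:
            record.point = None
        record.save()
        count += 1
    return count
```

In the command above this would be called as update_points(Places.objects.all(), Point). The database driver may still buffer rows on its own, so this is only one possible source of the memory growth.
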
>>> Since DEBUG is False, I'm already heeding the following advice:
>>>
>>>
>>> https://docs.djangoproject.com/en/dev/faq/models/#why-is-django-leaking-memory
>>>
>>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>>>
>>> I have found that the following might be nice,
>>> but doubt it addresses the memory issue.
>>>
>>>            record.save(update_fields=['point'])
>>>
>>> _______________________________________________
>>> CentralOH mailing list
>>> CentralOH at python.org
>>> http://mail.python.org/mailman/listinfo/centraloh
>>>