[melbourne-pug] django db race conditions

Noon Silk noonslists at gmail.com
Wed Oct 16 05:14:17 CEST 2013


In practice, the celery approach you mention is probably a good idea and
you should do that. But, more interestingly, suppose you have:

1. ask for thing
2. if no thing, do the time-consuming activity
3. check for thing again: create it if it is still missing, or update the
   existing one if it has been added in the meantime.

Written that way, it's pretty obvious that you just check again after the
time-consuming activity. Yes, there is still a race condition here, but no
more than there would normally be for a fast get-or-create, I think.
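
For what it's worth, here is a minimal sketch of that check-again flow. It
uses plain sqlite3 rather than the Django ORM so that it runs on its own;
the "thing" table, get_or_build and slow_init are names I made up for
illustration, with slow_init standing in for your init_object.

```python
import sqlite3


def get_or_build(conn, name, build):
    """Return (row_id, created) for `name`, re-checking after the slow step."""
    find = "SELECT id FROM thing WHERE name = ?"

    # 1. ask for thing
    row = conn.execute(find, (name,)).fetchone()
    if row is not None:
        return row[0], False

    # 2. no thing yet: do the time-consuming activity first
    payload = build(name)

    # 3. check again; someone else may have added it in the meantime
    with conn:  # commits on success, rolls back on an exception
        row = conn.execute(find, (name,)).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE thing SET payload = ? WHERE id = ?", (payload, row[0])
            )
            return row[0], False
        cur = conn.execute(
            "INSERT INTO thing (name, payload) VALUES (?, ?)", (name, payload)
        )
        return cur.lastrowid, True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE thing (id INTEGER PRIMARY KEY, name TEXT, payload TEXT)"
)


def slow_init(name):
    # Hypothetical stand-in for init_object(); imagine this takes seconds.
    return name.upper()


first = get_or_build(conn, "woof", slow_init)   # creates the row
second = get_or_build(conn, "woof", slow_init)  # finds it, skips the slow step
```

The remaining window is only the gap between the second SELECT and the
INSERT, rather than the whole duration of the slow step.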


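On the database constraint you mention: for the cases where name really
should be unique, the failure does not have to reach the user. A rough
sketch, again in plain sqlite3 and with made-up table and function names,
is to treat the constraint error as "somebody else won" and fall back to a
get. As far as I know, Django's own get_or_create leans on the same idea
when the database enforces uniqueness, but check the docs for your version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE thing (id INTEGER PRIMARY KEY, name TEXT UNIQUE, payload TEXT)"
)


def get_or_create(conn, name, payload):
    """Return (row_id, created); a lost race turns into a plain get."""
    try:
        with conn:  # commits on success, rolls back on an exception
            cur = conn.execute(
                "INSERT INTO thing (name, payload) VALUES (?, ?)",
                (name, payload),
            )
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint fired: the row already exists, so reuse it.
        row = conn.execute(
            "SELECT id FROM thing WHERE name = ?", (name,)
        ).fetchone()
        return row[0], False


winner = get_or_create(conn, "woof", "first")
loser = get_or_create(conn, "woof", "second")  # same id, created is False
```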

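On the "if task not created: create task" worry: one way to avoid the
check-then-act gap is to let a single conditional UPDATE do both the check
and the claim, and only queue the task if that UPDATE changed a row. The
sketch below assumes the row already exists and carries a "state" column,
which is my invention and not something from your code. In Django the same
shape would be a filter(...).update(...) call, which returns the number of
rows it matched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thing (name TEXT PRIMARY KEY, state TEXT NOT NULL)")
conn.execute("INSERT INTO thing (name, state) VALUES ('woof', 'new')")
conn.commit()


def claim_init(conn, name):
    """Flip state 'new' -> 'initialising'; True only for the caller that did it."""
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "UPDATE thing SET state = 'initialising' "
            "WHERE name = ? AND state = 'new'",
            (name,),
        )
    # rowcount is 1 for the caller that changed the row, 0 for everyone else.
    return cur.rowcount == 1


queued = []
for _ in range(3):  # three near-simultaneous requests for the same object
    if claim_init(conn, "woof"):
        queued.append("woof")  # hypothetical stand-in for creating the task
```

Only the first caller gets True, so "create task" runs once however many
requests arrive.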

On Wed, Oct 16, 2013 at 12:27 PM, Brian May
<brian at microcomaustralia.com.au> wrote:

> Hello All,
>
> I have a reasonable amount of Django code that follows this general model:
>
> try:
>    object = Model.objects.get(name="woof")
> except Model.DoesNotExist:
>    object = Model(name="woof")
>    init_object(object)
>    object.save()
>
> Or, in some cases:
>
> object, created = Model.objects.get_or_create(name="woof")
>
> In both cases the resultant code is very similar.
>
> In both cases there is a race condition. Depending on the flow of
> execution, I can end up with two or more db objects with name="woof". There
> are many forum posts discussing this race condition.
>
> As an example, the first case happens when displaying a web page. Let's
> assume init_object() is relatively slow. As the web page takes a while to
> load, the user clicks reload. This results in two (or more) objects being
> created with name="woof" in error.
>
> Another example: the second case occurs when a JavaScript app makes
> concurrent calls to the web service.
>
> Some people have suggested that if I want name to be unique, I should
> make it a database constraint. However, it is not always the case that I
> want these values to be strictly unique, I just want to reuse an existing
> entry or create it if it doesn't exist. Also, the database constraint would
> mean the code fails instead of committing two objects, which is not really
> helpful.
>
> Other people have suggested locking the db table while doing the
> get_or_create. This seems to require possibly db-specific SQL code, so I
> am a bit reluctant to do this.
>
> Django's select_for_update method is interesting; however, as the object
> doesn't actually exist yet, it is not really applicable.
>
> Another solution I have considered, at least for some cases, is
> moving init_object to a celery task. This would provide the user with
> faster feedback as to what is happening, and for some slow tasks is
> probably a good thing. Ideally I would only want one task to initialize
> the object, but I am not sure how I would check this without introducing
> new race conditions very similar to the one I am trying to remove, e.g.:
>
> if task not created:
>     create task
>
> In theory create task could be called multiple times.
>
> Another solution that would work in some places is to make sure that the
> object exists by some other means beforehand. So I can safely do a get
> instead of a get_or_create.
>
> Any other ideas?
>
> Quite possibly I will have to try and find a solution on a case-by-case
> basis :-(.
>
> Shame we didn't realize this before we wrote this code.
> --
> Brian May <brian at microcomaustralia.com.au>


-- 
Noon Silk

Fancy a quantum lunch? https://sites.google.com/site/quantumlunch/

"Every morning when I wake up, I experience an exquisite joy — the joy
of being this signature."