Robot Has No Heart

Xavier Shay blogs here

A robot that does not have a heart

acts_as_list will break in production

acts_as_list doesn’t work in a typical production deployment. It pretends to for a while, but every application will eventually have issues with it that result in real problems for your users. Here is a short 4 minute long screencast showing you how it breaks, and also a quick fix which will prevent your data from becoming corrupted.

(View it over at Vimeo if embedding doesn’t work for you)

Here is the “quick fix” I apply in the screencast. It’s ugly, but it will work.

def move_down
  Tractor.transaction do
    @tractor = Tractor.find(params[:id])

Some things to note when fixing your application in a nicer way:

  1. This is not MySQL specific, all databases will exhibit this behaviour.
  2. The isolation level needs to be set as the first statement in the transaction (or globally, but you don’t want serializable globally!)
  3. For bonus points, add a unique index to the position column, though you’ll have to re-implement most of acts_as_list to make it work.
  4. It’s possible to do this under read committed, but it’s pretty complicated and optimised for concurrent access rather than individual performance.
  5. Obtaining a row lock before moving will fix this specific issue, but won’t address all the edge cases.

_This is a small taste of the type of thing I cover in my DB is your friend training course. July through September I am running full day sessions in the US and UK. Chances are I’m coming to your city. Check it out at _

  1. dd says:

    I don't know why you deleted my comment. Apparently it wasn't clear enough: obviously the issue itself is not MySQL-specific, but is "SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE" MySQL-specific? As far as I can tell, it is, but you don't mention that anywhere. I believe (although haven't tested it) that the SQL to execute for PG would be "SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE".

  2. Jules says:

    Isn't there an even easier way of doing it; by just using the 'disable with' option on the button?

    Presumably the double click issue would only be a problem on buttons anyway?

    I'm still a bit of a RoR noob, so I always tend to think of the simplest things first...

  3. Xavier Shay says:

    dd: I updated the post to answer your original comment, so it was no longer necessary. Sorry it wasn't clear to me you were asking about the SQL. SET SESSION ... works for PG also.

    Jules: The double click is the easiest way to replicate this, but not the only way. Two users on different machines could click the button, they might have JS disabled, the action might conflict with a background task. Disabling after click may help your UI (though make sure it keeps working if they hit the back button and need to resubmit the form), but in the same way that you can't rely on client side only validations, you can't rely on client side only fixes for concurrency issues.

  4. Mark Thomas says:

    Are there any alternative gems that behave better, such as acts_as_nested_set, better_nested_set, etc.?

  5. Andrew Hodgkinson says:

    Mark - I've found this page via others due to a massive data loss on a real live client site caused by a related issue. I've been using awesome_nested_set (see Github) to manage trees of information. This includes a manager->worker kind of relationship between user accounts, where administrators manage a pool of agents and those agents each manage pools of artists or clients. I had no idea that the underlying tree code was ignorant of concurrency issues.

    Left and right values became corrupted in the user tree, even though individual parent/child relationships were correct. Since these were intact, corruption spread on for ages without being detected. It was only when an artist user deleted their account that we had the catastrophe. Although they had 'no children', they did have lots of 'descendant' user accounts because of the broken left/right values of that user. The nested set code automatically crops off the branch when you delete part of a tree; filters in the user model code make sure that 'genuine' child users are moved away first, however, this particular artist account *didn't have any children* to move as far as the model code was concerned ("user.children" => "[]").

    So - boom - suddenly a hundred user accounts were deleted by awesome_nested_set in a single SQL statement. It was almost a blessing that this was done below Rails, so that all the user products and profiles were not deleted by dependent-destroy declarations in the user model. The site pretty much immediately fell over, as references to "" and similar evaluated to "", causing numerous exceptions that alerted us to the fault very quickly.

    We restored from backup and lost about three days of changes, fortunately over Christmas which was a quiet period for our site. It could've been *much* worse.

    Ancestry looks to be a promising tree plugin - its documentation is superior, its implementation appears to be sound (wow, comments!) and since it only uses a single column to maintain tree state it ought to be immune to the corruption suffered to date.

  6. mrt says:

    Useful demo!

    Do you happen to know if they sorted out the concurrency issue in the current version of acts_as_list?

Post a comment

(lesstile enabled - surround code blocks with ---)

A pretty flower Another pretty flower