Robot Has No Heart

Xavier Shay blogs here


Three Reasons Why You Shouldn't Use Single Table Inheritance

It creates a cluttered data model. Why don’t we just have one table called objects and store everything as STI? STI tables have a tendency to grow as an application develops, and become intimidating and unwieldy because it isn’t clear which columns belong to which models.

It forces you to use nullable columns. A comic book must have an illustrator, but regular books don’t have an illustrator. Subclassing Book with Comic using STI forces you to allow illustrator to be null at the database level (for books that aren’t comics), and pushes your data integrity up into the application layer, which is not ideal.

It prevents you from efficiently indexing your data. Every index has to reference the type column, and you end up with indexes that are only relevant for a certain type.

The only time STI is the right answer is when you have models with exactly the same data, but different behaviour. You don’t compromise your data model, and everything stays neat and tidy. I have yet to see a case in the wild where this rule holds, though.
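As a rough illustration of "exactly the same data, but different behaviour", here is a plain Ruby sketch with no Rails involved. The class names, the `from_row` helper and the monthly rate are all hypothetical; the hashes stand in for rows of a single table whose `type` column picks the class.

```ruby
# Hypothetical rows from one `subscriptions` table: every row has the same
# columns, and `type` names the class that supplies the behaviour.
class Subscription
  attr_reader :months

  def initialize(months)
    @months = months
  end

  # Mimics how an STI `type` column selects the class to instantiate.
  def self.from_row(row)
    Object.const_get(row.fetch(:type)).new(row.fetch(:months))
  end
end

class TrialSubscription < Subscription
  def price
    0
  end
end

class ProSubscription < Subscription
  MONTHLY_RATE = 10 # assumed value, for illustration only

  def price
    months * MONTHLY_RATE
  end
end
```

Neither subclass adds a column, so nothing has to be nullable and no index is specific to one type.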

If you are using STI (or inheritance in general) to share code, you’re doing it wrong. Having many tables does not conflict with the Don’t-Repeat-Yourself principle. Ruby has modules; use them. (I once had a project where a 20-line hash drove the creation of migrations, models, data loaders and test blueprints.)
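A minimal sketch of sharing code through a module instead of a superclass. `Titled` and `display_title` are made-up names, and the `Struct` classes stand in for two models that would each be backed by their own table.

```ruby
# Hypothetical shared behaviour, mixed in rather than inherited.
module Titled
  # Assumes the including class responds to #title.
  def display_title
    title.split.map(&:capitalize).join(" ")
  end
end

# Two unrelated classes, each with its own set of attributes.
Book  = Struct.new(:title, :author) { include Titled }
Comic = Struct.new(:title, :author, :illustrator) { include Titled }
```

`Comic` does not inherit from `Book` here, yet both get `display_title` without any duplicated code.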

What you should be doing is using Class Table Inheritance. Rails doesn’t “support it natively”, but that doesn’t particularly mean much since it’s a simple pattern to implement yourself, especially if you take advantage of named scopes and delegators. Your data model will be much easier to work with, easier to understand, and more performant.
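To give a feel for the shape of this, here is a plain Ruby sketch of the composition-plus-delegation idea, using `Forwardable` from the standard library. It is not the implementation from the course: in a Rails app `Book` and `Comic` would be ActiveRecord models with an association between their tables, whereas here a `Struct` and a plain class stand in for them, and all names are hypothetical.

```ruby
require "forwardable"

# Stand-in for a row in a hypothetical `books` table: columns every book has.
Book = Struct.new(:id, :title, :author)

# Stand-in for a row in a hypothetical `comics` table: it holds only the
# comic-specific data plus a reference to its base book.
class Comic
  extend Forwardable

  attr_reader :book, :illustrator

  # Shared attributes are read from the base record.
  def_delegators :book, :title, :author

  def initialize(book:, illustrator:)
    # In a real schema this rule would be a NOT NULL column on `comics`.
    raise ArgumentError, "illustrator is required" if illustrator.nil?

    @book = book
    @illustrator = illustrator
  end
end
```

Because the illustrator lives only on the comic side, it can be mandatory there without forcing a nullable column onto regular books.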

I expand on this topic and guide you through a sample implementation in my DB is your friend training course. July through September I am running full-day sessions in the US and UK. Chances are I’m coming to your city. Check it out at http://www.dbisyourfriend.com

  1. Clifford Heath says:

    Nullable columns and inefficient indexing can easily be overcome, though Rails gets in the way a bit. In ActiveFacts, "subtype absorption" creates STI-like structures, but mandatory fields of subtypes are enforced through CHECK constraints. Unique indexes over subtype fields cause generation of a view which projects out the indexed fields of the subtype rows, and applies a unique index over that view.

    As for your first complaint, if there is a true subtyping relationship, your code will be cleaner by relying on it. If not, or if you can't keep code clean, well duh, of course it'll be worse.

    ActiveFacts also gives you the option of subtype separation (an extension table for the subtype's fields) or of partitioned subtypes (like CTI), and you can use any mixture of these that makes sense.

  2. Brian Cardarella says:

    I've heard similar arguments against STI before, never really thought any of the arguments were particularly strong. I've found that STI actually creates cleaner models. Especially when it comes to permission or validation issues.

    All that aside, the STI pattern using a NoSQL datastore seems to get around the last two issues. I've been using it in a recent app and have found it to work pretty well. (I don't think it would still be called STI in NoSQL though... maybe SCI?)

  3. Bob says:

    I can't remember who said it, but the quote was something along the lines of

    "You can have the logic when you take it from my cold dead Object Oriented fingers".

    I find having nullable columns to be a non-issue since the validation should be done at the application level (in the M of MVC where it belongs).

  4. Jeremy says:

    Well I was doing some stuff just yesterday with the same data model and different code. While I agree with the general sentiment, this situation does come up.

  5. Xavier Shay says:

    Cliff, what is the benefit of adding heaps of constraints on a table (insert performance issue?), over just having multiple tables, especially if they are generated by the framework? Is it just so you don't have to join over the tables?

    Brian, you can have clean code with either approach. I'm interested in clean data, as it is simpler to understand and much easier to work with, making applications easier to maintain. NoSQL is a different kettle of fish altogether, for discussion another day.

    Bob, while I appreciate the appeal of doing it all in the application layer, my experience with actual live Rails apps is that invariably they experience data integrity issues, especially after hundreds of migrations and thousands of users! Rails validations are far too easy to bypass (even by accident), and it's cheap to add some level of "validation" to the database, making possible nils a total non-issue. Fewer potential bugs, saves me time.

  6. Andy says:

    I have run into a few situations in my primary domain (judicial system) where STI is a very good fit. For example, a defendant, plaintiff, and attorney on a case all have the same attributes (name, address, phones, etc) but their behaviors are different by virtue of their type. Similarly, different types of cases can be described by the same attributes but have very different expectations.

    There is a tendency when you first encounter STI to see everything through that lens. Too many false abstractions emerge to support it. You're absolutely right to guard against it, but be similarly wary of throwing the baby out with the bathwater.

  7. Millisami says:

    Yes, I went through this with my first Rails apps as well.
    Later it started to be a pain and I had to redesign the model.

  8. Jared Fine says:

    What about a media STI? For instance, Image, Audio, Video, etc. subclass Media. Then I can do things like grab all media for a particular ID and check their type, render partials etc. How would you deal with this sort of situation?

  9. Xavier Shay says:

    Jared, would need to know more about your domain, but unless the data attributes are exactly the same (probably not, an image doesn't have a length) Class Table Inheritance gives you a clean data model, and the ability to query all of them together (via the base table).

  10. Clifford Heath says:

    Re "heaps of constraints", you only need one table-wide CHECK constraint for each subtype, which says "if it's this subtype, all these fields must be non-null". No performance issue there at all, since it's a simple in-memory computation.

    Regarding indexing views, you can use them as targets for FK constraints where a reference is made to the subtype. Plus the subtype might have different unique identifiers anyhow, like employee nr in a table of people who aren't all employees.

    Multiple (partitioned) tables are better when you generally deal with one type at a time and so don't need to UNION your joins. You don't get a single automatic ID across all tables, but you mightn't need that (if you do, multiple subtype extension tables, still with a supertype table, might work). Any of the three ways might work best, but my main point is that STI isn't evil. It's usage dependent - and that's why it should be hidden in the framework, so you can change it without recoding.

  11. Pete Yandell says:

    So is there a nice way to implement class table inheritance in Rails?

  12. Xavier Shay says:

    Pete, I find using composition to work really nicely for the use cases I have needed. You can get a feel for that approach in this post about eager loading with class table inheritance.

  13. Søren Houen says:

    Thank you Xavier. I was one migration away from polluting one of my models :)

  14. Mel says:

    These are strange complaints.

    I have not seen "STI tables have a tendency to grow and expand as an application develops". Or at least, certainly not any more than any other table in a Rails app.

    Yes, I have "data integrity up into the application layer", but Rails does that already (there's lots of checks that can't be done by my database at all) -- or I could use database triggers with an STI table, if I really cared about this.

    Whether "Every index has to reference the type column" depends entirely on the fields and use cases and indexes. In my case, they don't. Again, this goes back to your first issue: why not just have a single "objects" table? Because it would be terrible for querying. But STI is great for querying, since I want to query across these objects, and the alternative would be doing lots of SQL "UNION" statements, which Rails doesn't natively support.

    It sounds like you're mis-applying STI. But your issues are all so vague that it's hard to tell.

  15. Eric says:

    Here's another case from the wild where this comes up: subscriptions. We have different subscriptions (pro, premium, trial, etc.) that have identical data, but differ in their logic. STI is a great fit for these objects.

  16. Cameron says:

    Class table inheritance, concrete table inheritance and single table inheritance all have their pros and cons. I suggest reading Martin Fowler's PoEAA for an in-depth look at them. Maybe PostgreSQL's inheritance features could provide a solution to this aspect of the object-relational impedance mismatch?

    http://www.postgresql.org/docs/9.1/static/ddl-inherit.html
    http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch
