Finding related content with Sphinx
Previous efforts to find related posts with the classifier gem yielded no fruit, so I tried another approach using sphinx. Turned out to be a winner.
The basic theory is to index all posts by tag, then to find related posts just use the current post’s tags as a search string. Remember to exclude the current post from the search results. For this blog, I use tags for the main categories, which were corrupting the results – most everything is tagged ‘Ruby’ so it doesn’t add any value in determining likeness. So rather than indexing all tags I excluded some of the main ones.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
class Post < ActiveRecord::Base has_many :searchable_tags, :through => :taggings, :source => :tag, :conditions => "tags.name NOT IN ('Ruby', 'Code', 'Life')" def related_posts(number = 3) Post.search(:limit => number + 1, :conditions => { :tag_list => tag_list.join("|") }).reject {|x| x == self }.first(number) end define_index do indexes searchable_tags(:name), :as => :tag_list # If you want to use this for normal search as well you'll have to # add in title/body here as well end end |
For a more complete example, see the relevant RHNH commits: cdc0bf and d4d844
Showing links to related content is a good way to stop the bottom of your page from being a ‘dead end’. In the event that no related posts are found, I’m linking to the archives instead.