Robot Has No Heart

Xavier Shay blogs here

A robot that does not have a heart

Rake tab completion with caching and namespace support

UPDATE: It now invalidates the cache if you touch lib/tasks/*.rake, for those using it with rails (like me)

There are a few articles on the net about rake tab completion, but I had to combine a few of them to get what I wanted:

#!/usr/bin/env ruby

# Complete rake tasks script for bash
# Save it somewhere and then add
# complete -C path/to/script -o default rake
# to your ~/.bashrc
# Xavier Shay (http://rhnh.net), combining work from
#   Francis Hwang ( http://fhwang.net/ ) - http://fhwang.net/rb/rake-complete.rb
#   Nicholas Seckar <nseckar@gmail.com>  - http://www.webtypes.com/2006/03/31/rake-completion-script-that-handles-namespaces
#   Saimon Moore <saimon@webtypes.com>

require 'fileutils'

RAKEFILES = ['rakefile', 'Rakefile', 'rakefile.rb', 'Rakefile.rb']
exit 0 unless RAKEFILES.any? { |rf| File.file?(File.join(Dir.pwd, rf)) }
exit 0 unless /^rake\b/ =~ ENV["COMP_LINE"]

after_match = $'
task_match = (after_match.empty? || after_match =~ /\s$/) ? nil : after_match.split.last
cache_dir = File.join( ENV['HOME'], '.rake', 'tc_cache' )
FileUtils.mkdir_p cache_dir
rakefile = RAKEFILES.detect { |rf| File.file?(File.join(Dir.pwd, rf)) }
rakefile_path = File.join( Dir.pwd, rakefile )
cache_file = File.join( cache_dir, rakefile_path.gsub( %r{/}, '_' ) )
if File.exist?( cache_file ) &&
   File.mtime( cache_file ) >= (Dir['lib/tasks/*.rake'] << rakefile).collect {|x| File.mtime(x) }.max
  task_lines = File.read( cache_file )
else
  task_lines = `rake --silent --tasks`
  File.open( cache_file, 'w' ) do |f| f << task_lines; end
end
tasks = task_lines.split("\n")[1..-1].collect {|line| line.split[1]}
tasks = tasks.select {|t| /^#{Regexp.escape task_match}/ =~ t} if task_match

# handle namespaces
if task_match =~ /^([-\w:]+:)/
  upto_last_colon = $1
  after_match = $'
  tasks = tasks.collect { |t| (t =~ /^#{Regexp.escape upto_last_colon}([-\w:]+)$/) ? "#{$1}" : t }
end

puts tasks
exit 0
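
The cache check in the script boils down to comparing one modification time against the newest of several others. Here is a minimal sketch of that idea in isolation. The helper name `cache_fresh?` is hypothetical and not part of the script above, and treating missing source files as "nothing to compare" is an assumption of this sketch.

```ruby
# Hypothetical helper: true when cache_file exists and is at least as new
# as every existing file listed in sources.
def cache_fresh?(cache_file, sources)
  return false unless File.exist?(cache_file)

  existing = sources.select { |path| File.exist?(path) }
  return true if existing.empty? # assumption: nothing to compare against

  File.mtime(cache_file) >= existing.map { |path| File.mtime(path) }.max
end

# Possible usage, mirroring the script's inputs:
#   sources = Dir['lib/tasks/*.rake'] << 'Rakefile'
#   regenerate the task list unless cache_fresh?(cache_file, sources)
```

Pulling the comparison into one method makes it easy to add more files to watch later without touching the rest of the script.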

Finding related content with Sphinx

Previous efforts to find related posts with the classifier gem yielded no fruit, so I tried another approach using Sphinx. It turned out to be a winner.

The basic theory is to index all posts by tag, then to find related posts just use the current post’s tags as a search string. Remember to exclude the current post from the search results. For this blog, I use tags for the main categories, which were corrupting the results – most everything is tagged ‘Ruby’ so it doesn’t add any value in determining likeness. So rather than indexing all tags I excluded some of the main ones.

class Post < ActiveRecord::Base
  has_many :searchable_tags, 
           :through    => :taggings,
           :source     => :tag,
           :conditions => "tags.name NOT IN ('Ruby', 'Code', 'Life')"
  
  def related_posts(number = 3)
    Post.search(:limit => number + 1, :conditions => {
      :tag_list => tag_list.join("|")
    }).reject {|x| x == self }.first(number)
  end

  define_index do
    indexes searchable_tags(:name), :as => :tag_list
    # If you want to use this for normal search as well you'll have to 
    # add in title/body here as well
  end
end
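
To see the idea without a Sphinx install, here is a rough plain-Ruby sketch of the same theory: drop the over-used tags, match on what is left, and exclude the current post. `Article`, `IGNORED_TAGS` and `related_articles` are hypothetical names for illustration, and ranking by the number of shared tags is an assumption of this sketch, since Sphinx does its own ranking.

```ruby
# Tags that appear on nearly everything and so say nothing about likeness.
IGNORED_TAGS = %w[Ruby Code Life]

# Hypothetical stand-in for a Post record.
Article = Struct.new(:title, :tags)

def related_articles(current, all, number = 3)
  wanted = current.tags - IGNORED_TAGS

  # Exclude the current article by identity, then score by shared tags.
  candidates = all.reject { |article| article.equal?(current) }
  scored = candidates.each_with_index.map do |article, index|
    [article, (article.tags & wanted).size, index]
  end

  scored.select { |_, score, _| score > 0 }
        .sort_by { |_, score, index| [-score, index] } # index keeps ties stable
        .first(number)
        .map(&:first)
end
```

An empty result is the cue to fall back to something else, such as a link to the archives.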

For a more complete example, see the relevant RHNH commits: cdc0bf and d4d844

Showing links to related content is a good way to stop the bottom of your page from being a ‘dead end’. In the event that no related posts are found, I’m linking to the archives instead.

New Blender

I bought myself a shiny new blender on the weekend. Yay, hardware. It was a toss up between the sunbeam and a model (brand forgotten) with a round vessel. This model had less wattage. Neither my brother nor I have any idea what makes a good blender. They both had 6 blades. Juice bars use square vessels. And they blend a lot of stuff. So I went with the square one. It makes good smoothies. And desserts. Here it is pictured in preparation to make a green smoothie – a vegan classic.

Blender

  • 2 bananas
  • 1 handful of spinach
  • 1 cup of water (enough to thin it out a bit – might need more)

Blend! So yeah it’s green. Get over it. Tastes like banana. Mmmm, delicious.

Hash trumps case

# Two equivalent functions
def rgb(color)
  case color
    when :red   then 'ff0000'
    when :green then '00ff00'
    when :blue  then '0000ff'
    else             '000000' # Default to black
  end
end

def rgb2(color)
  {
    :red   => 'ff0000',
    :green => '00ff00',
    :blue  => '0000ff'
  }[color] || '000000'
end

Even though these functions are equivalent, the second carries more semantic weight – it maps a symbol directly to a color. The case sample makes no such guarantees since you can execute any arbitrary code in the then block. In addition, a hash is easier to work with – you can easily iterate over the keys, extract to another method if you need reuse, or query it for other properties (for example, 3 colors are available). It is also easier to read – both aesthetically and because it contains fewer tokens. In almost all circumstances I will prefer a hash over a case statement.
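
A short sketch of what "easier to work with" can look like once the hash is pulled out into a constant. `COLORS` and `rgb3` are names made up for this illustration.

```ruby
COLORS = {
  :red   => 'ff0000',
  :green => '00ff00',
  :blue  => '0000ff'
}.freeze

# Same behaviour as rgb2, with the default supplied through fetch.
def rgb3(color)
  COLORS.fetch(color, '000000')
end

COLORS.keys          # => [:red, :green, :blue]
COLORS.size          # => 3
COLORS.key('00ff00') # => :green (reverse lookup)
```

None of these queries are possible against the case version without rewriting it.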

Relationships in data are easier to comprehend and manipulate than relationships in code.

Contextual Composition With Delegation

I’ve had some models getting rather large recently. This makes them hard to comprehend and makes the source difficult to browse. A lot of the time, a big chunk of functionality is fairly context specific – it is only relevant to one particular part of my application (reporting, data integration, etc…). Thoughtbot presented one way to do this recently by adding methods to the model that return another model with the extra goodness.

That’s not bad, but it still pollutes the class with methods that most users won’t care about. We can just decorate the class with extra methods at the time (context) that we need them. My first go at doing this used the extend method:

class PurchaseOrder
  attr_reader :id
end

module Reports::PurchaseOrderMethods
  def description
    "A Purchase Order"
  end
end

class ReportMakerWithExtend
  def self.report_for(po)
    po.extend(Reports::PurchaseOrderMethods)
    "#{po.id}: #{po.description}"
  end
end

This has a few edge case problems though.

  1. It can potentially override methods in our base class. Imagine if PurchaseOrder#description was defined as private: our module would override this definition, probably resulting in breakage.
  2. It is inelegant to test – extend will override any existing stubs, so you need to stub it out. This is unintuitive and may have unintended consequences, for instance if the class is also using extend in a manner that doesn’t interfere with your stubs.

# Testing extended PurchaseOrder is inelegant
describe 'ReportMakerWithExtend#report_for' do
  it 'returns a line containing both ID and description' do
    po = stub(
      :id          => 1,
      :description => "hello",
      :extend      => nil # :(
    )
    ReportMakerWithExtend.report_for(po).should == "1: hello"
  end
end
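
The first problem is easy to reproduce in plain Ruby. In this sketch `DemoOrder` and `DemoReportMethods` are hypothetical stand-ins for the classes above, with a private `description` added to the base class to trigger the clash.

```ruby
class DemoOrder
  def summary
    "PO: #{description}" # relies on the private method below
  end

  private

  def description
    "internal notes"
  end
end

module DemoReportMethods
  def description
    "A Purchase Order"
  end
end

po = DemoOrder.new
po.summary                    # => "PO: internal notes"

po.extend(DemoReportMethods)  # the module now sits ahead of the class in lookup
po.summary                    # => "PO: A Purchase Order"
```

The object's own internals now call the reporting method, which is exactly the kind of breakage that is hard to spot from the call site.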

Ruby provides another way to achieve what we want in the form of SimpleDelegator. Basically, it passes on any methods not defined on itself to the object specified in the constructor. This way we can wrap another object without fear of interfering with its internals or our stubs.

require 'delegate'

class Reports::PurchaseOrder < SimpleDelegator
  def description
    "A Purchase Order"
  end
end

class ReportMaker
  def self.report_for(po)
    po = Reports::PurchaseOrder.new(po)
    "#{po.id}: #{po.description}"
  end
end

Much nicer. Of course, we would have specs for Reports::PurchaseOrder in addition to PurchaseOrder – this split allows us to keep our tests focussed and easy to read. Using delegation to split up your models allows you to separate code into areas where it is most relevant – helping keep both your models and your tests easy to read and maintain.
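
For anyone who wants to try the delegation approach outside a Rails app, here is a self-contained sketch. `PlainOrder`, `OrderForReport` and `report_line` are hypothetical names standing in for the classes above.

```ruby
require 'delegate'

# Stand-in for the model: it only knows its id.
PlainOrder = Struct.new(:id)

# Reporting behaviour lives in the wrapper, not in the model.
class OrderForReport < SimpleDelegator
  def description
    "A Purchase Order"
  end
end

def report_line(order)
  wrapped = OrderForReport.new(order)
  "#{wrapped.id}: #{wrapped.description}" # id is delegated, description is local
end

order = PlainOrder.new(1)
report_line(order)              # => "1: A Purchase Order"
order.respond_to?(:description) # => false, the original object is untouched
```

Because the wrapped object is never modified, a stub passed in from a spec keeps behaving exactly as it was set up.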

What's new in Enki - Admin Interface

I’ve just finished up a fairly major overhaul of the Enki admin area, finally throwing away the ugly SimpleLog stylings. Features include:

  1. New visual style, heavily inspired by the new Habari Monolith look
  2. New dashboard, with space to add your own data (feedburner subscribers? analytics data?)
  3. Nicer forms (thanks formtastic!)
  4. AJAX goodness for UI snappiness
  5. Undo for item deletion (no more alert boxes!)

Screens:

Enki - Admin dashboard

Enki - Admin posts list

Of course there’s still more I’d like to add (in particular to do with tags), but isn’t that always the case? I think it’s pretty swish – if you’ve already got an install just pull from master, if you think you might like an install, head over to the Enki website.

Testing flash.now with RSpec

flash.now has always been a pain to test. The traditional Rails approach is to use assert_select and find it in your views. This clearly doesn’t work if you want to test your controller in isolation.

Other folks have found workarounds to the problem, including mocking out the flash or monkey patching it.

These solutions feel a bit like using a sledgehammer to me. If you’re going to monkey patch/mock something, you want it to be as discreet as possible, so as to minimize the chance of the implementation changing underneath you and to reduce the effect on other areas of your application. Also, why duplicate perfectly good code that is provided elsewhere?

The real problem with testing flash.now is that it gets cleaned up (via #sweep) at the end of the action before you get to test anything. So let’s solve that problem and that problem only: disable sweeping of flash.now:

# spec/spec_helper.rb
module DisableFlashSweeping
  def sweep
  end
end

# A spec
describe BogusController, "handling GET to #index" do
  it "sets flash.now[:message]" do
    @controller.instance_eval { flash.extend(DisableFlashSweeping) }
    get :index
    flash.now[:message].should_not be_nil
  end
end

instance_eval is used to access the flash, since it’s a protected method, and we extend with the minimum possible code to do what we want – blanking out the sweep method. This should not cause problems because sweeping is only relevant across multiple requests, which we shouldn’t be doing in our controller specs.
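
The trick is ordinary Ruby and can be seen without Rails. In this sketch `FakeFlash` is a made-up stand-in for the real flash object (its `now` is just a hash), and `NoSweep` plays the role of DisableFlashSweeping.

```ruby
# Hypothetical stand-in for the flash: sweep throws away the "now" values.
class FakeFlash
  attr_reader :now

  def initialize
    @now = {}
  end

  def sweep
    @now.clear
  end
end

# The whole patch: a sweep that does nothing.
module NoSweep
  def sweep
  end
end

flash = FakeFlash.new
flash.extend(NoSweep)   # affects this one object only
flash.now[:message] = "saved"
flash.sweep
flash.now[:message]     # => "saved"
```

Extending a single instance keeps the change scoped to the spec that asks for it, so every other flash object still sweeps as usual.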

Classifier gem rubbish for recommending posts

Chatting with Tim today he suggested maybe using Classifier::LSI would be a cool way to offer ‘related posts’ suggestions for a blog.

Not really knowing anything about it, I whipped up a prototype rake task. It creates the index then marshals it to disk because it takes ages to create and it’s not much fun to play with when you have to wait minutes each time. It then presents 3 related suggestions for each post.

require 'classifier'

namespace :lsi do
  task :test => :environment do
    if File.exist?("lsidata.dump")
      lsi = File.open("lsidata.dump") {|f| Marshal.load(f) }
    else  
      lsi = Classifier::LSI.new
      Post.find(:all, :order => 'published_at DESC').each do |post|
        text = post.body
        categories = post.tags.collect(&:name)
        puts "Indexing " + post.title
        lsi.add_item(text, *categories)
      end
      File.open("lsidata.dump", "w") {|f| Marshal.dump(lsi, f) }
    end

    Post.find(:all).each do |post|
      puts post.title
      puts lsi.find_related(post.body, 3).collect {|i| Post.find_by_body(i).title }.inspect
    end
  end
end
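
The marshal-to-disk step is a handy pattern whenever something is slow to build. Here is a minimal sketch of it on its own. `load_or_build` is a hypothetical helper, and opening the file in binary mode is my choice rather than something the task above does.

```ruby
# Hypothetical helper: load a marshalled object from path if it exists,
# otherwise build it with the block, save it, and return it.
def load_or_build(path)
  if File.exist?(path)
    File.open(path, 'rb') { |f| Marshal.load(f) }
  else
    value = yield
    File.open(path, 'wb') { |f| Marshal.dump(value, f) }
    value
  end
end

# Possible usage with the task above (build_index is a placeholder):
#   lsi = load_or_build("lsidata.dump") { build_index }
```

Delete the dump file to force a rebuild, for example after re-tagging posts.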

Here’s the data for my last 5 articles. I don’t know what I was expecting, but this just doesn’t seem very helpful. I don’t have a very rich set of tags on my posts, so that probably has something to do with it. Was kind of hoping it would just look at text and all just work * waves hands *.

Seagate 500Gb FreeAgent Pro external drive - first impressions
  - Building Firefox Extensions
  - The Colemak Diaries
  - Counting ActiveRecord associations: count, size or length?
Coconut Oats
  - The Colemak Diaries
  - Summertime Tagliarini
  - Mary Iron Chef - Chocolate Jaffa Boxes
Mary Iron Chef - Chocolate Jaffa Boxes
  - The Colemak Diaries
  - Building Firefox Extensions
  - Summertime Tagliarini
Paypal IPN fails date standards
  - Building Firefox Extensions
  - Straight Sailing with Magellan
  - The Colemak Diaries
I'm number 8!
  - Extending Rails
  - Practical Hpricot: SVG
  - Day of days

Next step is to try tagging my stuff better and seeing if that helps out.

Getting classifier working

Quick side note – pure ruby classifier doesn’t work out of the box with rails because it also redefines Array#sum. If you install the GSL lib and the ruby bindings (see classifier docs) you’ll still need this one line patch to classifier to get it to work:

Index: lib/classifier/lsi.rb
===================================================================
--- lib/classifier/lsi.rb       (revision 31)
+++ lib/classifier/lsi.rb       (working copy)
@@ -25,6 +25,8 @@
   # please consult Wikipedia[http://en.wikipedia.org/wiki/Latent_Semantic_Indexing].
   class LSI
     
+    include GSL if $GSL
+    
     attr_reader :word_list
     attr_accessor :auto_rebuild

UPDATE: I’ve forked classifier on github, so you can just grab that version if you like.

Nginx, OpenID delegation and YADIS

Typically OpenID delegation reads delegation information out of HTML headers on your home page:

<link rel="openid.server" href="http://server.myid.net/server" />
<link rel="openid.delegate" href="http://xaviershay.myid.net/" />

The problem with this is that any client trying to discover this information needs to fetch your entire home page. If that client is your page (commenting on your own entry, for instance), that request can get queued up behind the same mongrel that was serving the original request, which of course now won’t complete until the OpenID delegation request times out.

There is another way to provide delegation information. Clients will request your home page with an accept header of application/xrds+xml – and you can use that information to serve up a static YADIS file rather than your home page. Mine looks like this:

<xrds:XRDS xmlns:xrds="xri://$xrds" xmlns="xri://$xrd*($v*2.0)"
      xmlns:openid="http://openid.net/xmlns/1.0">
  <XRD>

    <Service priority="1">
      <Type>http://openid.net/signon/1.0</Type>
      <URI>https://server.myid.net/server</URI>
      <openid:Delegate>https://xaviershay.myid.net/</openid:Delegate>
    </Service>

  </XRD>
</xrds:XRDS>

And I serve it up with this Nginx rewrite rule:

if ($http_accept ~* application/xrds\+xml) {
  rewrite (.*) $1/yadis.xrdf break;
}

Try it in the comfort of your own home:

curl -H 'Accept: application/xrds+xml' http://rhnh.net
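
If you want to reason about the rule without an Nginx install, the decision it makes can be sketched in Ruby. `XRDS_ACCEPT` and `document_for` are hypothetical names. The sketch mirrors the rewrite literally, including the doubled slash it produces for the root path.

```ruby
# Case-insensitive match, like the ~* operator in the Nginx rule.
XRDS_ACCEPT = %r{application/xrds\+xml}i

# Returns the path that would be served for a given Accept header.
def document_for(accept_header, path)
  if accept_header.to_s =~ XRDS_ACCEPT
    "#{path}/yadis.xrdf" # same substitution as: rewrite (.*) $1/yadis.xrdf
  else
    path
  end
end

document_for("text/html", "/")            # => "/"
document_for("application/xrds+xml", "/") # => "//yadis.xrdf"
```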

Ref: OpenID for non-SuperUsers

Powered by Enki

Finally got this blog switched over to Enki. Main feed has moved to feed burner. Please report any weirdness to the relevant authorities.

For some extra content, here’s what’s happening in the Enki world:

  • Moved to github (keeping gitorious as a mirror)
  • Tim has a functional multiple authors fork
  • API is functional if you want to kick the tyres a bit, still needs some work though. Here is some code to publish from VIM

I'm number 8!

I had no idea Working With Rails ran a monthly hackfest. Basically, you contribute to rails, you get points, at the end of the month you can win stuff. To my surprise, I came in at #8 last month and got a free copy of “Make” magazine from O’Reilly.

Sweet. Thank you doc patches.

Obligatory thumbs-up-with-swag photo:

Working With Rails Hackfest Prize

Paypal IPN fails date standards

Paypal Instant Payment Notification lets you know when you have received a paypal payment. Presumably, you then mark an order as paid or something. Do not use the current time as the paid_at date – despite the ‘instant’ in the title it can be many days later. You should use the payment_date provided by paypal. Your accountant will thank you.

But here’s the rub. From the IPN spec, payment_date is:

Time/Date stamp generated by PayPal system [format: “18:30:30 Jan 1, 2000 PST”]

Seen that date format before? No? Didn’t think so. That’s no RFC I’ve seen before. The popular Paypal gem uses Time.parse, but this is incorrect (as of 2.0.0). Observe:

>> Time.parse("18:30:30 Mar 28, 2008 PST")
=> Fri Mar 28 18:30:30 +1100 2008 # Good
>> Time.parse("18:30:30 Feb 28, 2008 PST")
=> Fri Mar 28 18:30:30 +1100 2008 # FAIL

Also, Time only has a range of about a week, so that could screw you over come any major system failures (either you or paypal). Also note the payment_date is in PST, which unless you’re on the right side of the US is fairly useless. I recommend the following:

>> DateTime.strptime("18:30:30 Jan 1, 2000 PST", "%H:%M:%S %b %e, %Y %Z").new_offset(0)
=> Sun, 02 Jan 2000 02:30:30 +0000

The un-intuitive new_offset converts to UTC. Patch submitted. I hate you, Paypal.
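
Wrapped up as a small helper, the recommended parsing might look like this. `PAYPAL_DATE_FORMAT` and `parse_payment_date` are names I have made up for the sketch and are not part of the Paypal gem.

```ruby
require 'date'

# Format of payment_date as quoted from the IPN spec.
PAYPAL_DATE_FORMAT = "%H:%M:%S %b %e, %Y %Z"

# Parses a PayPal payment_date string and returns it as a UTC DateTime.
def parse_payment_date(string)
  DateTime.strptime(string, PAYPAL_DATE_FORMAT).new_offset(0)
end

parse_payment_date("18:30:30 Feb 28, 2008 PST").iso8601
# => "2008-02-29T02:30:30+00:00"
```

Because the format is spelled out, a string that doesn't match raises an error instead of being silently parsed as some other date.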

Mary Iron Chef - Chocolate Jaffa Boxes

Mary at Kenneth Falls

The picturesque Otways served as an inspiring backdrop to the inaugural Mary Iron Chef Challenge. Tension was high – I had teamed up with the renowned dessert specialist Amelia Ie, pitted against the young superstar couple Yujin and Katie (photo). Chairman Tim flamboyantly revealed the challenge ingredient – Chocolate! – and with a bang of the saucepan lid gong started the 90 minute Timer Of Impending Dessert.

Amelia and I made 3 dishes for this challenge. Our crowning achievement was the Chocolate Jaffa Boxes. As a judge gushed – ‘the rich velvet couverture of the enclosure frolics playfully with the airy mousse, while the mango reminds me of the playful delights of summer’. Accept that translation at your own risk.

Chocolate Jaffa Boxes

Makes 8

Ingredients

  • 500g dark chocolate, melted
  • 250g milk chocolate, melted
  • 1 packet orange jelly crystals
  • Generous splash of brandy
  • 500ml thickened cream
  • 1 Mango

Method

  1. Spread dark chocolate thinly over 2 trays covered in foil, saving a small amount for later. Refrigerate until solid – this will become the boxes.
  2. Whisk cream until fluffy (use electric beaters)
  3. Mix together brandy and jelly crystals, then dissolve crystals in microwave (takes about a minute). Inhale fumes deeply.
  4. Add jelly mix to milk chocolate, then fold in half of the cream. You fold rather than stir because it helps keep the mixture aerated.
  5. This bit takes some geometric nous – take the solid dark chocolate out of the fridge and with a sharp knife divide each tray into 20 portions – groups of 5 will be used to make each box. A diagram here would be nice but I don’t have the tools. The base portion can be bigger than the other 4, as long as they all come from the same strip so that they have the same edge length. Take your time with this step because you don’t want to shatter any of the pieces.
  6. Assemble each group of 5 portions into a box, using the leftover melted chocolate to stick them together. Look out, here comes some math: 20×2 / 5 = 8 boxes.
  7. Spoon chocolate mix into each box, then add a dollop of cream to each
  8. Slice up the mango and arrange it NICELY on the top of each box
  9. Refrigerate until the chocolate mix sets (we didn’t do this because we only had 90 minutes, but the ones we left overnight were much tastier)

This challenge was a lot of fun. We got to wear funny hats. Special thanks to Amelia, without whose kitchen mastery I would have probably just served chocolate pieces in a bowl.

Iron Chef - Chocolate Jaffa Box

Apologies for the absence of tech posts lately, that’s just how life is at the moment. Hopefully have something geekier to write about soon.

Coconut Oats

A more appropriate name may be “Ghetto Dessert #1”. Once again, I neglected the supermarket and tried cooking with whatever was in the cupboards.

Coconut Oats

Serves 1-2

Ingredients

  • 1 bowl of oats
  • 1/2 can coconut milk
  • Caster sugar + maple syrup OR brown sugar + cocoa

Method

  1. Soak oats in coconut milk until it is absorbed (longer is better, I left mine for about 90 minutes)
  2. Mix in your choice of condiments

I experimented with a few different sweeteners – the four listed above individually and also honey. Honey didn’t work so well, but the 2 combinations above I think were winners. Adding fruit to the maple syrup variant would be particularly tasty, but we’re never home enough to have fruit on hand. I’m going to try turning the chocolate one into porridge by warming it in a saucepan. I have another bowl sitting in the fridge that I’m going to leave overnight a la bircher muesli to see just how much the oats can absorb.

And for bonus points it’s vegan – a rare property of my desserts.

UPDATE: Leaving overnight is highly recommended. A tasty non-vegan option is mixing in nutella.

UPDATE 2: Mix in caster sugar, dried apricot and cranberries, then serve with shredded coconut and flaked almonds. This is the best one.

Seagate 500Gb FreeAgent Pro external drive - first impressions

It has a stupid name. The title is the first and last time I will refer to it as anything other than a “Seagate 500Gb external drive”. What is not stupid is the packaging. It’s clear, concise, fun, and most importantly makes me feel like Seagate actually cares about the people who use its products. Observe the following shots of the static packaging and the instruction booklet:

Packaging

Instructions #1

Instructions #2

Text on the last frame says: “Note: Times may vary depending on how excited you are about using your new FreeAgent Pro data mover.” Delicious.

I had to format it as FAT32 because as far as I can tell OSX doesn’t support writing to NTFS volumes. This makes me sad. I presume linux can write to Mac’s filesystem, but AFAIK windows can’t, which unfortunately I need to support because that’s what all my family use :( No fault of the drive here, just another windows gripe. Although linux has had NTFS write support stable for a while now, I wouldn’t mind Mac catching up.

It is much quieter than I expected. It’s under full load right now – I’m rsyncing to it.

5 year warranty, so I guess they have confidence in the product.

Initial impression is positive, ask me again when I actually have to restore from it.

Unrelated footnote: Technically I’m back from my holiday, but I’m snowed under with dancing commitments for now so coding updates (and enki updates) will still be sporadic.

UPDATE: Just reformatted for Time Machine, YAGNIed the work-with-family requirement.

A pretty flower Another pretty flower