Robot Has No Heart

Xavier Shay blogs here


How I Test Rails Applications

The Rails conventions for testing provide three categories for your tests:

  • Unit. What you write to test your models.
  • Integration. Used to test the interaction among any number of controllers.
  • Functional. Testing the various actions of a single controller.

This tells you where to put your tests, but the type of testing you perform on each part of the system is the same: load fixtures into the database to get the app into the required state, run some part of the system either directly (models) or using provided harnesses (controllers), then verify the expected output.

This technique is simple, but it is only one of a number of ways of testing. As your application grows, you will need to add other approaches to your toolbelt so that your test suite continues to provide valuable feedback not just on the correctness of your code, but on its design as well.

I use a different set of categories for my tests (taken from the GOOS book):

  • Unit. Do our objects do the right thing, and are they convenient to work with?
  • Integration. Does our code work against code we can’t change?
  • Acceptance. Does the whole system work?

Note that these definitions of unit and integration are radically different to how Rails defines them. That is unfortunate, but these definitions are more commonly accepted across other languages and frameworks and I prefer to use them since it facilitates an exchange of information across them. All of the typical Rails tests fall under the “integration” label, leaving two new levels of testing to talk about: unit and acceptance.

Unit Tests

“A test is not a unit test if it talks to the database, communicates across a network, or touches the file system.” – Working Effectively with Legacy Code, p. 14

This type of test is typically referred to in the Rails community as a “fast unit test”, which is unfortunate since speed is far from the primary benefit. The primary benefit of unit testing is the feedback it provides on the dependencies in your design. “Design unit tests” would be a better label.

This feedback is absolutely critical in any non-trivial application. Unchecked dependency is crippling, and Rails encourages you not to think about it (most obviously by implicitly autoloading everything).

By unit testing a class you are forced to think about how it interacts with other classes, which leads to simpler dependency trees and simpler programs.

Unit tests tend to (though don’t always have to) make use of mocking to verify interactions between classes. Using rspec-fire is absolutely critical when doing this. It verifies your mocks represent actual objects with no extra effort required in your tests, bridging the gap to statically-typed mocks in languages like Java.
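
To make that guarantee concrete, here is a minimal sketch of what a verifying double does, written in plain Ruby. It is not rspec-fire's API or implementation, and VerifiedDouble, Mailer and Signup are hypothetical names invented for this illustration.

```ruby
# A hand-rolled "verifying double": it refuses to stub a method that the
# real class does not define. Illustration only, not rspec-fire itself.
class VerifiedDouble
  attr_reader :calls

  def initialize(klass, stubs = {})
    @calls = []
    stubs.each do |name, value|
      unless klass.method_defined?(name)
        raise ArgumentError, "#{klass} does not implement ##{name}"
      end

      # Record every call so the test can verify the interaction afterwards.
      define_singleton_method(name) do |*args|
        @calls << [name, args]
        value
      end
    end
  end
end

# Hypothetical collaborator and class under test.
class Mailer
  def deliver(address); end
end

class Signup
  def initialize(mailer)
    @mailer = mailer
  end

  def register(address)
    @mailer.deliver(address)
  end
end

mailer = VerifiedDouble.new(Mailer, deliver: true)
Signup.new(mailer).register("don@example.com")
mailer.calls # => [[:deliver, ["don@example.com"]]]

# A typo or a renamed method is caught as soon as the double is built.
begin
  VerifiedDouble.new(Mailer, delivr: true)
rescue ArgumentError => e
  e.message # => "Mailer does not implement #delivr"
end
```

The point is that the double and the real class cannot silently drift apart, which is the property that makes heavily mocked unit tests trustworthy.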

As a guideline, a single unit test shouldn’t take more than 1ms to run.

Acceptance Tests

A Rails integration test doesn’t exercise the entire system, since it uses a harness rather than driving the system from the perspective of a user. For example, you post form parameters directly rather than actually filling out the form. This makes the test brittle, in that it will still pass if you change your HTML form, and incomplete, in that it doesn’t load the page in a browser and verify that JavaScript and CSS are not interfering with the submission of the form.

Full system testing was popularized by the cucumber library, but cucumber adds a level of indirection that isn’t useful for most applications. Unless you are actually collaborating with non-technical stakeholders, the extra complexity just gets in your way. RSpec can easily be written in a BDD style without extra libraries.

Theoretically you should only be interacting with the system as a black box, which means no creating fixture data or otherwise messing with the internals of the system in order to set it up correctly. In practice, this tends to be unwieldy, but I still maintain a strict abstraction so that tests read like black box tests, hiding any internal modification behind an interface that could be implemented by black box interactions, but is “optimized” to use internal knowledge. I’ve had success with the builder pattern, also presented in the GOOS book, but that’s another blog post (i.e. build_registration.with_hosting_request.create).
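
As a rough idea of the shape such a builder might take, here is a minimal sketch. Only the call chain build_registration.with_hosting_request.create comes from this post; RegistrationBuilder, the STORE array standing in for the database, and the default attributes are all assumptions made for illustration.

```ruby
# Hypothetical builder. STORE stands in for whatever persistence the real
# application uses.
STORE = []

class RegistrationBuilder
  def initialize(store)
    @store = store
    @attributes = { name: "Don", hosting_request: false }
  end

  # Each with_* method tweaks one attribute and returns self so calls chain.
  def with_hosting_request
    @attributes[:hosting_request] = true
    self
  end

  def with_name(name)
    @attributes[:name] = name
    self
  end

  # This is where the "optimized" shortcut would live in a real suite, for
  # example a direct insert instead of driving the sign-up form in a browser.
  def create
    @store << @attributes.dup
    @store.last
  end
end

def build_registration
  RegistrationBuilder.new(STORE)
end

registration = build_registration.with_hosting_request.create
registration[:hosting_request] # => true
registration[:name]            # => "Don"
```

Because the test only ever sees build_registration and the with_* methods, the implementation of create can be swapped between black box interactions and internal shortcuts without touching any test.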

A common anti-pattern is to try and use transactional fixtures in acceptance tests. Don’t do this. It isn’t executing the full system (so can’t test transaction level functionality) and is prone to flakiness.

An acceptance test will typically take seconds to run, and should only be used for happy-path verification of behaviour. It makes sure that all the pieces hang together correctly. Edge case testing should be done at the unit or integration level. Ideally each new feature should have only one or two acceptance tests.

File Organisation

I use spec/{unit,integration,acceptance} folders as the parent of all specs. Each type of spec has its own helper require, so unit specs require unit_helper rather than spec_helper. Each of those helpers will then require other helpers as appropriate. For instance, my rails_helper looks like this (note the hack required to support this layout):

ENV["RAILS_ENV"] ||= 'test'
require File.expand_path("../../config/environment", __FILE__)

# By default, rspec/rails tags all specs in spec/integration as request specs,
# which is not what we want. There does not appear to be a way to disable this
# behaviour, so below is a copy of rspec/rails.rb with this default behaviour
# commented out.
require 'rspec/core'

RSpec::configure do |c|
  c.backtrace_clean_patterns << /vendor\//
  c.backtrace_clean_patterns << /lib\/rspec\/rails/
end

require 'rspec/rails/extensions'
require 'rspec/rails/view_rendering'
require 'rspec/rails/adapters'
require 'rspec/rails/matchers'
require 'rspec/rails/fixture_support'
require 'rspec/rails/mocks'
require 'rspec/rails/module_inclusion'
# require 'rspec/rails/example' # Commented this out
require 'rspec/rails/vendor/capybara'
require 'rspec/rails/vendor/webrat'

# Added the below, we still want access to some of the example groups
require 'rspec/rails/example/rails_example_group'
require 'rspec/rails/example/controller_example_group'
require 'rspec/rails/example/helper_example_group'

Controller specs go in spec/integration/controllers, though I’m trending towards using poniard, which allows me to test controllers in isolation (spec/unit/controllers).

Helpers are either unit or integration tested depending on the type of work they are doing. If a helper contains domain level logic it can be unit tested (though I tend to use presenters for this, which are also unit tested). Helpers that layer on top of Rails provided helpers (like link_to or content_tag) should be integration tested to verify they are using the library in the correct way.

I have used this approach on a number of Rails applications over the last 1-2 years and found it leads to better and more enjoyable code.

Blocking (synchronous) calls in Goliath

Posting for my future self. A generic function to run blocking code in a deferred thread and resume the fiber on completion, so as not to block the reactor loop.

def blocking(&f)
  fiber = Fiber.current
  result = nil
  EM.defer(f, ->(x){
    result = x
    fiber.resume
  })
  Fiber.yield
  result
end

Usage

class MyServer < Goliath::API
  def response(env)
    blocking { sleep 1 }
    [200, {}, 'Woken up']
  end
end
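
To watch the yield and resume dance without EventMachine or Goliath installed, here is a toy sketch. ToyReactor is a made-up stand-in for the reactor (one worker thread per deferred operation plus a callback queue), and this blocking takes the reactor as an explicit argument, unlike the version above.

```ruby
require 'fiber' # only needed for Fiber.current on Ruby older than 3.1

# Made-up reactor: runs the operation in a thread, then queues the callback
# so that it executes on the thread that calls run_once.
class ToyReactor
  def initialize
    @callbacks = Queue.new
  end

  def defer(operation, callback)
    Thread.new do
      result = operation.call
      @callbacks << -> { callback.call(result) }
    end
  end

  # Wait for one deferred operation to finish and run its callback.
  def run_once
    @callbacks.pop.call
  end
end

def blocking(reactor, &f)
  fiber = Fiber.current
  result = nil
  reactor.defer(f, ->(x) {
    result = x
    fiber.resume
  })
  Fiber.yield # hand control back to whoever resumed this fiber
  result
end

reactor = ToyReactor.new
log = []

handler = Fiber.new do
  log << :before
  log << blocking(reactor) { sleep 0.01; 42 }
  log << :after
end

handler.resume       # runs the handler until it reaches Fiber.yield
log << :reactor_free # the "reactor" thread is free while the sleep runs
reactor.run_once     # callback fires and resumes the handler

log # => [:before, :reactor_free, 42, :after]
```

The ordering of the log is the whole point: the calling thread gets control back while the slow work runs elsewhere, and the handler picks up exactly where it left off once the result is ready.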

Form Objects in Rails

For a while now I have been using form objects instead of nested attributes for complex forms, and the experience has been pleasant. A form object is an object designed explicitly to back a given form. It handles validation, defaults, casting, and translation of attributes to the persistence layer. A basic example:

require 'active_model'

class Form::NewRegistration
  include ActiveModel::Validations

  def self.scalar_attributes
    [:name, :age]
  end

  attr_accessor *scalar_attributes
  attr_reader :event

  validates_presence_of :name

  def initialize(event, params = {})
    # Keep the event so that #create and the event reader can use it.
    @event = event

    self.class.scalar_attributes.each do |attr|
      self.send("%s=" % attr, params[attr]) if params.has_key?(attr)
    end
  end

  def create
    return unless valid?

    registration = Registration.create!(
      event: event,
      data_json: {
        name: name,
        age:  age.to_i,
      }.to_json
    )

    registration
  end

  # ActiveModel support
  def self.name; "Registration"; end
  def persisted?; false; end
  def to_key; nil; end
end

Note how this allows an easy mapping from form fields to a serialized JSON blob.

I have found this more explicit and flexible than tying forms directly to nested attributes. It allows more fine tuned control of the form behaviour, is easier to reason about and test, and enables you to refactor your data model with minimal other changes. (In fact, if you are planning on refactoring your data model, adding in a form object as a “shim” to protect other parts of the system from change before you refactor is usually desirable.) It even works well with nested attributes, using the form object to build up the required nested hash in the #create method.

Relationships

A benefit of this approach, albeit still a little clunky, is having accessors map one to one with form fields even for one to many associations. My approach takes advantage of Ruby’s flexible object model to define accessors on the fly. For example, if a registration has multiple custom answer fields, as defined on the event, I call the following method on initialisation:

def add_answer_accessors!
  event.questions.each do |q|
    attr = :"answer_#{q.id}"
    instance_eval <<-RUBY
      def #{attr};     answers[#{q.id}]; end
      def #{attr}=(x); answers[#{q.id}] = x; end
    RUBY
  end
end

With the exception of the above code (which isn’t too bad), this greatly simplifies typical code for handling one to many relationships: it avoids fields_for, index, and is easier to set up sane defaults for.
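
For a version that runs on its own, here is a sketch of the same idea using define_singleton_method with closures instead of evaluating a string. Question, Event and RegistrationForm are hypothetical stand-ins rather than the classes from my application.

```ruby
# Hypothetical stand-ins for the event and its questions.
Question = Struct.new(:id, :text)
Event    = Struct.new(:questions)

class RegistrationForm
  attr_reader :event, :answers

  def initialize(event, params = {})
    @event   = event
    @answers = {}
    add_answer_accessors!
    params.each { |name, value| public_send("#{name}=", value) }
  end

  private

  # One reader and one writer per question, defined on this instance only.
  def add_answer_accessors!
    event.questions.each do |q|
      define_singleton_method("answer_#{q.id}")  { answers[q.id] }
      define_singleton_method("answer_#{q.id}=") { |x| answers[q.id] = x }
    end
  end
end

event = Event.new([
  Question.new(7, "Dietary needs?"),
  Question.new(9, "T-shirt size?")
])
form = RegistrationForm.new(event, answer_7: "Vegetarian")

form.answer_7   # => "Vegetarian"
form.answer_9   # => nil
form.answers[7] # => "Vegetarian"
```

Each form field then maps to a plain accessor such as answer_7, so the view can treat it like any other scalar attribute.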

Casting

I use a small supporting module to handle casting of attributes to certain types.

module TypedWriter
  # Attribute first, then type, matching the call `typed_writer :age, :integer`.
  def typed_writer(attribute, type)
    class_eval <<-EOS
      def #{attribute}=(x)
        @#{attribute} = type_cast(x, :#{type})
      end
    EOS
  end

  def type_cast(x, type)
    case type
    when :integer
      x.to_s.length > 0 ? x.to_i : nil
    when :boolean
      x.to_s.length > 0 ? x == true || x == "true" : nil
    when :boolean_with_nil
      if x.to_s == 'on' || x.nil?
        nil
      else
        x.to_s.length > 0 ? x == true || x == "true" : nil
      end
    when :int_array
      [*x].map(&:to_i).select {|x| x > 0 }
    else
      raise "Unknown type #{type}"
    end
  end

  def self.included(klass)
    # Make methods available both as class and instance methods.
    klass.extend(self)
  end
end

It is used like so:

class Form::NewRegistration
  # ...

  include TypedWriter

  typed_writer :age, :integer
end
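
To see the casting behaviour in isolation, here is a compact, self-contained variant. It is a sketch rather than the module above: CastingWriter, casting_writer and SignupForm are hypothetical names, it uses define_method instead of class_eval, and it only covers the integer and boolean cases.

```ruby
module CastingWriter
  # Blank input becomes nil, mirroring the rules in type_cast above.
  CASTS = {
    integer: ->(x) { x.to_s.empty? ? nil : x.to_i },
    boolean: ->(x) { x.to_s.empty? ? nil : (x == true || x == "true") }
  }.freeze

  def casting_writer(attribute, type)
    cast = CASTS.fetch(type) { raise ArgumentError, "Unknown type #{type}" }
    define_method("#{attribute}=") do |value|
      instance_variable_set("@#{attribute}", cast.call(value))
    end
  end
end

class SignupForm
  extend CastingWriter

  attr_reader :age, :newsletter

  casting_writer :age, :integer
  casting_writer :newsletter, :boolean
end

form = SignupForm.new

form.age = "23"
form.age        # => 23

form.age = ""
form.age        # => nil

form.newsletter = "true"
form.newsletter # => true
```

Mapping blank strings to nil matters because browsers submit empty fields as "", and "".to_i would otherwise quietly turn a missing age into 0.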

Testing

I don’t load Rails for my form tests, so an explicit require of active model is necessary. I do this in my form code since I like explicitly requiring third-party dependencies everywhere they are used.

require 'unit_helper'

require 'form/new_registration'

describe Form::NewRegistration do
  include RSpec::Fire

  let(:event) { fire_double('Event') }

  subject { described_class.new(event) }

  def valid_attributes
    {
      name: 'don',
      age:  25
    }
  end

  def form(extra = {})
    described_class.new(event, valid_attributes.merge(extra))
  end

  describe 'validations' do
    it 'is valid for default attributes' do
      form.should be_valid
    end

    it { form(name: '').should have_error_on(:name) }
  end

  describe 'type-casting' do
    let(:f) { form } # Memoize the form

    # This pattern is overkill in this example, but useful when you have many
    # typed attributes.
    let(:typecasts) {{
      int: {
        nil  => nil,
        ""   => nil,
        23   => 23,
        "23" => 23,
      }
    }}

    it 'casts age to an int' do
      typecasts[:int].each do |value, expected|
        f.age = value
        f.age.should == expected
      end
    end
  end

  describe '#create' do
    it 'returns false when not valid' do
      subject.create.should_not be
    end

    it 'creates a new registration' do
      f = form
      new_rego = fire_double("Registration")
      dao = fire_replaced_class_double("Registration")
      # The block checks the arguments and supplies the return value.
      dao.should_receive(:create!) {|x|
        x[:event].should == event

        data = JSON.parse(x[:data_json])

        data['name'].should == valid_attributes[:name]
        data['age'].should == valid_attributes[:age]

        new_rego
      }
      f.create.should == new_rego
    end
  end

  it { should_not be_persisted }
end

Code Sharing

I tend to have a parent object Form::Registration, with subclasses for Form::{New,Update,View}Registration. A common mixin would also work. For testing, I use a shared spec that is run by the specs for each of the three subclasses.

Conclusion

There are other solutions to this problem (such as separating validations completely) which I haven’t tried yet, and I haven’t used this approach on a team yet. It has worked well for my solo projects though, and I’m just about confident enough to recommend it for production use.

Poniard: a Dependency Injector for Rails

I just open sourced poniard, a dependency injector for Rails. It’s a newer version of code I posted a few weeks back that allows you to write controllers using plain ruby objects:

module Controller
  class Registration
    def update(response, now_flash, update_form)
      form = update_form

      if form.save
        response.respond_with SuccessfulUpdateResponse, form
      else
        now_flash[:message] = "Could not save registration."
        response.render action: 'edit', ivars: {registration: form}
      end
    end

    SuccessfulUpdateResponse = Struct.new(:form) do
      def html(response, flash, current_event)
        flash[:message] = "Updated details for %s" % form.name
        response.redirect_to :registrations, current_event
      end

      def js(response)
        response.render json: form
      end
    end
  end
end

This makes it possible to test them in isolation, leading to a better appreciation of your dependencies and nicer code.

Check it out!

Guice in your JRuby

At work we have a Java application container that uses Google Guice for dependency injection. I thought it would be fun to try and embed some Ruby code into it.

Guice uses types and annotations to wire components together, neither of which Ruby has. It also uses Java meta-class information heavily (SomeClass.class). High hurdles, but we can clear them.

Warming Up

Normally JRuby is used to interpret Ruby code inside a Java environment, but it also provides functionality to compile a Ruby class to a Java one. In essence, it creates a Java wrapper class that delegates all calls to Ruby. Let’s look at a simple example.

# SayHello.rb
class SayHello
  def hello(name)
    puts "Hello #{name}"
  end
end

Compile using the jrubyc script. By default it compiles directly to a .class file, but it doesn’t work correctly at the moment. Besides, going to Java first allows us to see what is going on.

jrubyc --java SayHello.rb

The compiled Java is refreshingly easy to understand. It even has comments!

Imports are redacted from all Java examples for brevity.

// SayHello.java
public class SayHello extends RubyObject  {
    private static final Ruby __ruby__ = Ruby.getGlobalRuntime();
    private static final RubyClass __metaclass__;

    static {
        String source = new StringBuilder("class SayHello\n" +
            "  def hello(name)\n" +
            "    puts \"Hello #{name}\"\n" +
            "  end\n" +
            "end\n" +
            "").toString();
        __ruby__.executeScript(source, "SayHello.rb");
        RubyClass metaclass = __ruby__.getClass("SayHello");
        metaclass.setRubyStaticAllocator(SayHello.class);
        if (metaclass == null) throw new NoClassDefFoundError("Could not load Ruby class: SayHello");
        __metaclass__ = metaclass;
    }

    /**
     * Standard Ruby object constructor, for construction-from-Ruby purposes.
     * Generally not for user consumption.
     *
     * @param ruby The JRuby instance this object will belong to
     * @param metaclass The RubyClass representing the Ruby class of this object
     */
    private SayHello(Ruby ruby, RubyClass metaclass) {
        super(ruby, metaclass);
    }

    /**
     * A static method used by JRuby for allocating instances of this object
     * from Ruby. Generally not for user comsumption.
     *
     * @param ruby The JRuby instance this object will belong to
     * @param metaclass The RubyClass representing the Ruby class of this object
     */
    public static IRubyObject __allocate__(Ruby ruby, RubyClass metaClass) {
        return new SayHello(ruby, metaClass);
    }

    /**
     * Default constructor. Invokes this(Ruby, RubyClass) with the classloader-static
     * Ruby and RubyClass instances assocated with this class, and then invokes the
     * no-argument 'initialize' method in Ruby.
     *
     * @param ruby The JRuby instance this object will belong to
     * @param metaclass The RubyClass representing the Ruby class of this object
     */
    public SayHello() {
        this(__ruby__, __metaclass__);
        RuntimeHelpers.invoke(__ruby__.getCurrentContext(), this, "initialize");
    }

    public Object hello(Object name) {
        IRubyObject ruby_name = JavaUtil.convertJavaToRuby(__ruby__, name);
        IRubyObject ruby_result = RuntimeHelpers.invoke(__ruby__.getCurrentContext(), this, "hello", ruby_name);
        return (Object)ruby_result.toJava(Object.class);
    }
}

Simple: A Java class with concrete type and method definitions, delegating each method to Ruby. For the next step, JRuby supports metadata provided in Ruby to control the exact types and annotations that are used in the generated code.

# SayHello.rb
class SayHello
  java_signature 'void hello(String)'
  def hello(name)
    puts "Hello #{name}"
  end
end

With the signature in place, the generated Java method uses the declared types:

public void hello(String name) {
    IRubyObject ruby_name = JavaUtil.convertJavaToRuby(__ruby__, name);
    IRubyObject ruby_result = RuntimeHelpers.invoke(__ruby__.getCurrentContext(), this, "hello", ruby_name);
    return;
}

Perfect! Now we have all the pieces we need to start wiring our Ruby into Guice.

Guice

Let’s start by injecting an object that our Ruby class can use to do something interesting.

public class JrubyGuiceExample {
  public static void main(String[] args) {
    Injector injector = Guice.createInjector();
    SimplestApp app = injector.getInstance(SimplestApp.class);
    app.run();
  }
}

require 'java'

java_package 'net.rhnh'

java_import 'com.google.inject.Inject'

class SimplestApp
  java_annotation 'Inject'
  java_signature 'void SimplestApp(BareLogger logger)'
  def initialize(logger)
    @logger = logger
  end

  def run
    @logger.info("Hello from Ruby")
  end
end

Guice will see the BareLogger type, and automatically create an instance of that class to be passed to the initializer.

Guice also allows more complex dependency graphs, such as knowing which concrete class to provide for an interface. These are declared using a module, which we can also write in Ruby (though it is probably not a good idea). The following example tells Guice to provide an instance of PrefixLogger whenever the SimpleLogger interface is asked for.

public class JrubyGuiceExample {
  public static void main(String[] args) {
    Injector injector = Guice.createInjector(new ComplexModule());
    ComplexApp app = injector.getInstance(ComplexApp.class);
    app.run();
  }
}

require 'java'

java_package 'net.rhnh'

java_import 'com.google.inject.Provides'
java_import 'com.google.inject.Binder'

class ComplexModule
  java_implements 'com.google.inject.Module'

  java_signature 'void configure(Binder binder)'
  def configure(binder)
    binder.
      bind(java::SimpleLogger.java_class).
      to(java::PrefixLogger.java_class)
  end

  protected

  def java
    Java::net.rhnh
  end
end

You can also provide more complex setup logic in dedicated methods with the Provides annotation. See the example project linked at the bottom of the post.

Maven integration

Running jrubyc all the time is a drag. Thankfully, someone has already made a maven plugin that puts everything in the right place.

<plugin>
  <groupId>de.saumya.mojo</groupId>
  <artifactId>jruby-maven-plugin</artifactId>
  <version>0.29.1</version>
  <configuration>
    <generateJava>true</generateJava>
    <generatedJavaDirectory>target/generated-sources/jruby</generatedJavaDirectory>
  </configuration>
  <executions>
    <execution>
      <phase>process-resources</phase>
      <goals>
        <goal>compile</goal>
      </goals>
    </execution>
  </executions>
</plugin>

Now running mvn package will compile Ruby code from src/main/ruby to Java code in target, which is then available for the main Java build to compile.

For more examples and runnable code, see the jruby-guice project on GitHub.

Benchmarking RSpec double versus OpenStruct

I noticed a number of my unit tests were taking upwards of 10ms, an order of magnitude slower than they should be. It turns out I was abusing rspec doubles: I was using one in place of a value object. Doubles are far slower than plain Ruby objects, particularly as the number of attributes goes up. The growth looks linear, but the constant factor is bad. The following benchmark compares a double with an OpenStruct, which can often be used as a drop-in replacement. (Normally I just use the value object itself, but in this case it was an ActiveRecord subclass.)

require 'ostruct'

describe 'benchmark' do
  let(:attributes) {
    ENV['N'].to_i.times.each_with_object({}) {|x, h| h["attr_#{x}"] = 'hello' }
  }

  5.times do
    it 'measures doubles' do
      double(attributes)
    end

    it 'measures structs' do
      OpenStruct.new(attributes)
    end
  end
end

Only 6-8 attributes before the 1ms barrier is broken, and this is only for construction!
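
If you want a quick sanity check outside of RSpec, the standard library Benchmark module can time the OpenStruct side on its own. This is a sketch, not the method used for the numbers in this post: attributes_for, construction_time and the iteration count are arbitrary choices of mine, and it does not measure doubles at all since those need rspec-mocks loaded.

```ruby
require 'benchmark'
require 'ostruct'

# Build a hash with n attributes, as in the spec above.
def attributes_for(n)
  n.times.each_with_object({}) { |i, h| h["attr_#{i}"] = 'hello' }
end

# Average seconds per OpenStruct construction for n attributes.
def construction_time(n, iterations = 1_000)
  attrs = attributes_for(n)
  OpenStruct.new(attrs) # warm up, result discarded
  Benchmark.realtime { iterations.times { OpenStruct.new(attrs) } } / iterations
end

timings = [1, 5, 10, 20].to_h { |n| [n, construction_time(n)] }

timings.each do |n, seconds|
  puts format("%2d attributes: %.6fs per construction", n, seconds)
end
```

Averaging over many iterations smooths out the warm up noise that made the first RSpec measurement unreliable.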

To graph it, I threw out the first result for each measurement, since it tended to be all over the shop during warm up. The following script is a hack that relies on a priori knowledge that double is slower, since it doesn’t try to match the rspec profile output measurements to their labels. The measurements are so different in this case that it works.

> for N in {1..20}; do env N=$N rspec benchmark_spec.rb -p | \
  grep seconds | \
  grep benchmark_spec | \
  awk '{print $1}' | \
  xargs echo $N; done > results.dat

> gnuplot << eor
set terminal jpeg size 600,200 font "arial,9"
set key left
set output 'graph.jpg'
set datafile separator " "
set xlabel '# of attributes'
set ylabel 'construction time (s)'
plot 'results.dat' u 1:( (\$3+\$4+\$5+\$6)/4) with lines title 'Double', \
       '' u 1:( (\$8+\$9+\$10+\$11) / 4) with lines title 'Struct'
eor

My next project: what is the best way to get the elevated guarantees provided by rspec-fire without taking the speed hit?

Testing Stripe OAuth Connect with Capybara and Selenium

Stripe only allows you to set a fixed redirect URL in your test OAuth settings. This is problematic because you need to redirect to a different host and port depending on whether you are in development or test mode. In other words, there is a global callback that needs to be routed correctly to local callbacks.

My workaround is to use a simple rack application that redirects any incoming requests to the selected host and port. The Capybara host and port is written out to a file on spec start, and if that isn’t present it assumes development. It is clearly a hack, but works fairly well until Stripe provides a better way to do it.

# stripe.ru
run lambda {|env|
  req = Rack::Request.new(env)

  server_file = "/tmp/capybara_server"
  host_and_port = if File.exist?(server_file)
    File.read(server_file)
  else
    "localhost:3000"
  end

  # Rack::Response.new takes an optional body, not the request env.
  response = Rack::Response.new
  url = "http://#{host_and_port}"
  url << req.path
  url << "?#{req.query_string}" unless req.query_string.empty?

  response.redirect(url)
  response.finish
}

# spec/acceptance_helper.rb
require 'fileutils'

SERVER_FILE = "/tmp/capybara_server"

Capybara.server {|app, port|
  File.open(SERVER_FILE, "w") {|f| f.write("%s:%i" % ["127.0.0.1", port]) }
  Capybara.run_default_server(app, port)
}

RSpec.configure do |config|
  config.after :suite do
    FileUtils.rm(SERVER_FILE) if File.exist?(SERVER_FILE)
  end
end

This requires the rack application to be running already (much like the database is expected to be running), which can be done thusly:

bundle exec rackup --port 3001 stripe.ru

Set your Stripe callback to http://localhost:3001/your/callback.
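
The routing decision itself is small enough to pull out and test without Rack. Here is a sketch of that logic as a plain function; redirect_url and DEFAULT_TARGET are names made up for this example, and the port written to the temp file is arbitrary.

```ruby
require 'tempfile'

DEFAULT_TARGET = "localhost:3000" # development server, as in stripe.ru

# Build the local URL that an incoming global callback should be sent to.
def redirect_url(server_file, path, query_string)
  host_and_port =
    if File.exist?(server_file)
      File.read(server_file).strip
    else
      DEFAULT_TARGET
    end

  url = "http://#{host_and_port}#{path}"
  url += "?#{query_string}" unless query_string.to_s.empty?
  url
end

# No server file: assume development.
redirect_url("/nonexistent/capybara_server", "/your/callback", "code=abc")
# => "http://localhost:3000/your/callback?code=abc"

# Server file present: route to the Capybara server recorded in it.
Tempfile.create("capybara_server") do |f|
  f.write("127.0.0.1:54321")
  f.flush
  redirect_url(f.path, "/your/callback", "")
  # => "http://127.0.0.1:54321/your/callback"
end
```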

Dependency Injection for Rails Controllers

What if controllers looked like this:

module Controller
  class Registration
    def update(response, now_flash, update_form)
      form = update_form

      if form.save
        response.respond_with SuccessfulUpdateResponse, form
      else
        now_flash[:message] = "Could not save registration."
        response.render action: 'edit', ivars: {registration: form}
      end
    end

    SuccessfulUpdateResponse = Struct.new(:form) do
      def html(response, flash, current_event)
        flash[:message] = "Updated details for %s" % form.name
        response.redirect_to :registrations, current_event
      end

      def js(response)
        response.render json: form
      end
    end
  end
end

It is a plain ruby object that receives all needed dependencies via method arguments. (Requires Some Magic, explained below.) This is a style of dependency injection inspired by Raptor, Dropwizard and Guice. It allows you to cleanly separate authorization, object fetching, control flow, and other typical controller responsibilities, and as a result is much easier to organise and test than the traditional style.

require 'unit_helper'
require 'ostruct'

require 'injector'
require 'controller/registration'

describe Controller::Registration do
  success_response = Controller::Registration::SuccessfulUpdateResponse

  let(:form)      { fire_double("Form::UpdateRegistration") }
  let(:response)  { fire_double("ControllerSource::Response") }
  let(:event)     { fire_double("Event") }
  let(:flash)     { {} }
  let(:now_flash) { {} }
  let(:injector)  { Injector.new([OpenStruct.new(
    response:      response.as_null_object,
    current_event: event.as_null_object,
    update_form:   form.as_null_object,
    flash:         flash,
    now_flash:     now_flash
  )]) }

  describe '#update' do
    it 'saves form and responds with successful update' do
      form.should_receive(:save).and_return(true)
      response
        .should_receive(:respond_with)
        .with(success_response, form)

      injector.dispatch described_class.new.method(:update)
    end

    it 'render edit page when save fails' do
      form.should_receive(:save).and_return(false)
      response
        .should_receive(:render)
        .with(action: 'edit', ivars: {registration: form})

      injector.dispatch described_class.new.method(:update)

      now_flash[:message].length.should > 0
    end
  end

  describe success_response do
    describe '#html' do
      it 'redirects to registration' do
        response.should_receive(:redirect_to).with(:registrations, event)

        injector.dispatch success_response.new(form).method(:html)
      end

      it 'includes name in flash message' do
        form.stub(:name).and_return("Don")

        injector.dispatch success_response.new(form).method(:html)

        flash[:message].should include(form.name)
      end
    end
  end
end

Before filters and authorization can be extracted out into a separate source, and will be applied when they are named in a method. For instance, if you specify current_event as a method argument in Controller::Registration#update, you will receive Controller::RegistrationSource#current_event. Authorization is interesting: requesting authorized_organiser when not authorized will cause an UnauthorizedException, which you can handle in your base ApplicationController (note: the above example omits authorization).

module Controller
  class RegistrationSource
    def current_event(params)
      Event.find(params[:event_id])
    end

    def current_registration(params, current_event)
      current_event.registrations.find(params[:id])
    end

    def current_organiser(session)
      Organiser.find_by_id(session[:organiser_id])
    end

    def authorized_organiser(current_event, current_organiser)
      unless current_organiser && current_organiser.can_edit?(current_event)
        raise UnauthorizedException
      end

      # Return the organiser so the injected argument is usable.
      current_organiser
    end

    def update_form(params, current_registration)
      Form::UpdateRegistration.build(
        current_registration,
        params[:registration]
      )
    end
  end
end

Magic wiring

An Injector is responsible for introspecting method arguments and finding an appropriate object from its sources to inject. In the controller case two sources are required: one for standard controller dependencies (params, flash, etc), and one for application specific logic (the RegistrationSource seen above).

class RegistrationsController < ApplicationController
  def update
    injector = Injector.new([
      ControllerSource.new(self),
      Controller::RegistrationSource.new
    ])
    injector.dispatch Controller::Registration.new.method(:update)
  end
end

The injector itself is fairly straightforward. The tricky part is the recursive dispatch, which enables sources to themselves request dependency injection, allowing the type of decomposition seen in RegistrationSource, where authorized_organiser depends on the definition of current_organiser in the same class.

UnknownInjectable is a cute trick for testing: you don’t need to specify every dependency requested by the method, only the ones that are being used by the code path being executed. In non-test code it probably makes sense to raise an exception earlier.

class Injector
  attr_reader :sources

  def initialize(sources)
    @sources = sources + [self]
  end

  def dispatch(method, overrides = {})
    args = method.parameters.map {|_, name|
      source = sources.detect {|source| source.respond_to?(name) }
      if source
        dispatch(source.method(name), overrides)
      else
        UnknownInjectable.new(name)
      end
    }
    method.call(*args)
  end

  def injector
    self
  end

  class UnknownInjectable < BasicObject
    def initialize(name)
      @name = name
    end

    def method_missing(*args)
      ::Kernel.raise "Tried to call method on an uninjected param: #{@name}"
    end
  end
end
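
A rough sketch of how the UnknownInjectable trick might play out in a test. The Injector here is a condensed copy of the one above (without the unused `overrides` argument) so that the example runs on its own; `FakeSource` and `ShowHandler` are hypothetical names made up for illustration.

```ruby
# Condensed copy of the Injector from the post, so the sketch runs standalone.
class Injector
  class UnknownInjectable < BasicObject
    def initialize(name)
      @name = name
    end

    def method_missing(*)
      ::Kernel.raise "Tried to call method on an uninjected param: #{@name}"
    end
  end

  def initialize(sources)
    @sources = sources + [self]
  end

  def injector
    self
  end

  def dispatch(method)
    args = method.parameters.map do |_kind, name|
      source = @sources.detect { |s| s.respond_to?(name) }
      source ? dispatch(source.method(name)) : UnknownInjectable.new(name)
    end
    method.call(*args)
  end
end

# Hypothetical test source: provides params but deliberately not flash.
class FakeSource
  def params
    { id: 1 }
  end
end

# Hypothetical handler: flash is only touched on the failure path.
class ShowHandler
  def show(params, flash)
    return "showing #{params[:id]}" if params[:id]
    flash[:error] = "missing id"
  end
end

puts Injector.new([FakeSource.new]).dispatch(ShowHandler.new.method(:show))
```

The happy path never touches `flash`, so the test does not have to supply it. If the failure path ran instead, the placeholder would raise as soon as `flash[:error]` was assigned.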

Finally for completeness, an implementation of ControllerSource:

class ControllerSource
  Response = Struct.new(:controller, :injector) do
    def redirect_to(path, *args)
      controller.redirect_to(controller.send("#{path}_path", *args))
    end

    def render(*args)
      ivars = {}
      if args.last.is_a?(Hash) && args.last.has_key?(:ivars)
        ivars = args.last.delete(:ivars)
      end

      ivars.each do |name, val|
        controller.instance_variable_set("@#{name}", val)
      end

      controller.render *args
    end

    def respond_with(klass, *args)
      obj = klass.new(*args)
      format = controller.request.format.symbol
      if obj.respond_to?(format)
        injector.dispatch obj.method(format)
      end
    end
  end

  def initialize(controller)
    @controller = controller
  end

  def params;    @controller.params; end
  def session;   @controller.session; end
  def flash;     @controller.flash; end
  def now_flash; @controller.flash.now; end

  def response(injector)
    Response.new(@controller, injector)
  end
end

Initial impressions are that it does feel like more magic until you get in the groove, after which it is no more so than normal Rails. I remember my epiphany when writing Guice code—“oh you just name a thing and you get it!”—after which the ride became a lot smoother. I really like the better testability of controllers, since that has always been a pain point of mine. I’m going to experiment some more on larger chunks of code, try and nail down the naming conventions some more.

Disclaimer: I haven’t used this idea in any substantial form, beyond one controller action from a project I have lying around. It remains to be seen whether it is a good idea or not.

All code as a gist.

Automatically backup Zoho Calendar, Google Calendar

Quick script I put together to automatically back up all of Jodie’s calendars for her.

Works for any online calendar that exposes an iCal link. You’ll need to replace “http://icalurl” in the script with the private iCal URL of your calendar. In Zoho, this is under Settings > My Calendars > Share > Enable private Address for this calendar.

require 'date'
require 'fileutils'

calendars = {
  'My Calendar'    => 'http://icalurl',
  'Other Calendar' => 'http://icalurl'
}

folder = Date.today.to_s

FileUtils.mkdir_p(folder)

calendars.each do |name, url|
  puts %|Backing up "#{name}"...|
  `curl -s "#{url}" > "#{folder}/#{name}.ics"`
end
puts "Done!"

Stores a folder per day. For bonus points, put it straight into Dropbox.

Screencast: moving to Heroku

A treat from the archives! I found a screen recording with commentary of me moving this crusty old blog from a VPS on to Heroku from about a year ago. It’s still pretty relevant, not just technology wise but also how I work (except I wasn’t using tmux then).

This is one take with no rehearsal, preparation or editing, so you get my development and thought process raw. All two and a half hours of it. That has positives and negatives. I don’t know how interesting this is to others, but I am putting it out there just in case. Make sure you watch the videos in a viewer that can speed up playback.

An interesting observation I noted was that I tend to have two tasks going in parallel most of the time to context switch between when I’m blocked on one waiting for a gem install or the like.

I have divided it into four parts, each around 40 minutes long and 350MB in size.

  • Part 1 gets the specs running, green, fixes deprecations, moves from 1.8 to 1.9.
  • Part 2 moves from MySQL to Postgres, replaces sphinx with full text search.
  • Part 3 continues the sphinx to postgres transition, implementing related posts
  • Part 4 deploys the finished product to heroku, copies data across, and gets exception notification working.

Rough indexes are provided below.

Part 1

0:00 Introduction
0:50 rake, bundle
1:42 Search for MySQL to PG conversion, maybe taps gem?
3:22 bundle finishes
3:42 couldn’t parse YAML file, switch to 1.8.7 for now
4:10 Add .rvmrc
4:39 bundle again for 1.8.7
4:50 Search for Heroku cedar stack docs (back when it was new), reading
6:30 Gherkin fails to build
8:50 Can’t find solution, update gherkin to latest
9:10 Find YAML fix while waiting for gherkin to update
10:08 Cancel gherkin update, switch to 1.9.2 and apply YAML fix
10:20 AWS S3 gem not 1.9 compatible, but not needed anymore so delete
11:10 Remove db2s3 gem also
11:20 nil.[] error, non-obvious
11:50 Missing test db config
12:20 Tests are running, failures
12:50 Debug missing partial error, start local server to click around and it works here
14:15 Back to fixing specs
14:25 Removed functionality but not specs, clearly haven’t been running specs regularly. Poor form.
15:45 Target specs passing
16:13 Fix a deprecation warning along the way
16:40 Commit fixes for 1.9.2
17:50 While waiting for specs, check for sphinx code
18:05 author_ip can’t be null, why is that still there?
18:50 make it nullable, don’t want to delete old data right now
19:40 Search for MySQL syntax
21:06 Oh actually author_ip does get set, specs actually are broken
22:07 Add blank values to spec, fixes spec.
22:39 Add blank values in again, would be nice to extract duplicate code
23:35 Start fixing tagging
24:30 Why no backtraces? Argh color scheme hiding them, must have reset recently
25:50 This changed recently? Look at git log
26:46 Looks like a dodgy merge, fixed. That’ll learn me for not running specs
28:15 Tackle view specs, long time since I’ve used these.
29:06 Be easier if I had factories, look for them.
29:23 Find them under cucumber
30:11 Extract valid_comment_attributes to spec_helper.rb
32:15 Fix broken undo logic
33:00 Extracting common factory logic
33:08 hmm, can you super from a method defined inside a spec?
33:30 yeah, apparently
35:28 working, check in
36:00 Fixing view specs
36:30 Remove approved_comments_count, don’t do spam checking anymore
37:15 Actually it is still there. Need to fix mocks.
39:15 Fix deprecations while waiting for specs.
39:30 Missing template
40:15 Need to use render :template
40:40 Check in, fixed view specs.
41:05 Running specs, looking all green. Fix RAILS_ENV to Rails.env
41:45 All green!

Part 2

0:30 Removing sphinx
2:20 Add pg gem
4:00 Create databases
4:45 Ah it’s postgres, not pg in database.yml
5:15 derp, postgresql
6:00 What are defensio migrations still doing hanging around?
6:45 Move database migrations around to not collide
7:45 taps
8:40 run tests against PG in background
9:30 don’t have open id columns in prod, it was removed in latest enki
11:25 ffffuuuuuu migrations and schema.rb
12:40 taps install failed on rhnh.net, why installing sqlite?
14:00 Argh can’t parse yaml
14:45 Abort taps remotely, bring mysqldump locally
16:00 Try taps locally
17:20 404 :(
17:50 it’s away!
18:10 Invalid encoding UTF-8, dammit.
18:30 New plan, there’s a different gem that does this.
19:00 What is it? I did it in a screencast, I should know this.
19:40 Found it! mysql2psql
20:20 taps, you’re cut
21:00 Setup mysql2psql.yml config
22:20 Works. That was much easier.
23:20 delayed_job, why is that here? Try removing it.
23:50 Used to use it for spam checking, but not anymore.
24:10 Time to replace search, how to do this?
25:00 Index tag list?
26:00 Hmm need full text search as well.
26:15 Step one: normal search, on title and body
27:00 Spec it, extract faux-factory for posts
29:00 Failing spec, implement
30:00 Search for PG full text search syntax
31:30 Passing, add in title search also
32:40 Passing with title as well
33:10 Adding tag cache to posts for easy searching
36:10 Argh migrations are screwed.
36:40 Move migrations back to where they were
39:09 Amend migration move like it never happened
38:45 Add data migration to tag_cache migration
39:30 WTF already have a tag cache. Where did it come from?
39:40 Delete everything I just did.
41:40 Check in web interface, works.

Part 3

00:20 related posts using full text search
02:55 sort by rank, reading docs
03:50 difference between ts_rank and ts_rank_cd?
4:30 Too hard, just pick one and see what happens
5:15 Syntax error in ts_query
5:45 plainto_tsquery
6:40 working, need to use or rather than and
10:30 Ah, using plainto, fix that.
11:04 Order by rank
12:20 syntax error, need to interpolate keywords
13:45 Search for how to escape SQL string in Activerecord
14:15 Find interpolate_sql, looks promising
14:50 Actually no, find sanitize_sql_array
15:20 Just try it, works. Click around to verify.
16:45 Add spec
21:20 Passing specs, commit
21:45 Why isn’t tagging working?
23:30 Ah, probably case insensitive. Need to use ILIKE.
24:00 Write a test for it
26:00 Have a failing test
26:30 Argh it’s inside acts_as_taggable_on_steroids plugin
27:20 Override the method directly in model, just for now
28:30 Commit that
29:00 Remove searchable_tags
32:00 Fix tags with spaces
34:00 Exclude popular tags from search (fix the wrong thing)
35:40 Back to fixing tags with spaces
37:20 Looking at rankings, good enough for now
38:00 Move sphinx namespace into rhnh

Part 4

00:30 Checking docs for new Cedar stack
1:30 Search for how to import data
2:20 pg_dump of data
2:50 Move dump to public Dropbox so heroku can access it
3:40 Push code to heroku
4:50 Taking a while, hmm repo is big
5:50 Clone a copy to tmp, check if it’s still big.
6:00 Yeah, eh not a big deal, it’s been a while a number of years.
7:00 heroku push done, run heroku ps. Crashed :(
7:30 AWS? I deleted you >:[
8:00 Argh I pushed master, not my branch
9:30 heroku ps, crashed again
10:30 Unclear, probably exception notifier, remove it
11:30 add thin gem while waiting
12:30 Running, expect not to work because database not set up
13:05 Create procfile
13:35 Import pg backup
15:20 Working, click around, make sure it’s working
16:20 Check whether atom feed is working
17:30 Check exception notifications
19:00 Either new comments, or something is wrong.
19:20 Yep new comments, need to reimport data. Do that later.
20:00 Back to exception notification. Used to be an add-on.
21:20 Don’t want hoptoad or get exceptional, maybe sendgrid with exception notifier?
22:00 Searching for examples.
22:20 Found stack overflow answer, looks promising.
24:20 Bring back exception notifier with sendgrid.
26:00 logs show sent mail, arrives in email
26:15 Next steps, DNS settings, extra database dump.

Automatically pushing git repositories to Bitbucket

Bitbucket gives you unlimited private repositories. It’s the perfect place to archive all my crap to. Here is a script to create remotes for all repositories in a folder and push them up. I had 38 of them.

$usr    = "xaviershay"
$remote = "bitbucket"

def main
  directories_in_cwd.each do |entry|
    existing_remotes = remotes_for(entry)

    action_performed = if existing_remotes
      if already_added?(existing_remotes)
        "EXISTING"
      else
        create_remote_repository(entry)
        push_local_repository_to_remote(entry)
        "ADD"
      end
    else
      "SKIP"
    end

    puts action_performed + " #{entry}"
  end
end

def directories_in_cwd
  Dir.entries(".").select {|entry|
    File.directory?(entry) && !%w(. ..).include?(entry)
  }
end

def remotes_for(entry)
  gitconfig = "#{entry}/.git/config"
  return unless File.exist?(gitconfig) # File.exists? was removed in Ruby 3.2
  # Collect the "url = ..." lines without shelling out to cat and grep.
  File.readlines(gitconfig).grep(/url =/).map(&:chomp)
end

def already_added?(existing)
  existing.any? {|x| x.include?($remote) }
end

def create_remote_repository(entry)
  run %{curl -s -i --netrc -X POST -d "name=#{entry}" } +
          %{-d "is_private=True" -d "scm=git" } +
          %{https://api.bitbucket.org/1.0/repositories/}
end

def push_local_repository_to_remote(entry)
  Dir.chdir(entry) do
    run "git remote add #{$remote} git@bitbucket.org:#{$usr}/#{entry}.git"
    run "git push #{$remote} master"
  end
end

def run(cmd)
  `#{cmd}`
end

main

So you aren’t prompted for username and password every time, you should create a `.netrc` file.

> cat ~/.netrc
machine api.bitbucket.org login xaviershay password notmyrealpassword

DataMapper Retrospective

I introduced DataMapper on my last two major projects. As those projects matured after I had left, they both migrated to a different ORM. That deserves a retrospective, I think. As I’ve left both projects, I don’t have the insider level of detail on the decision to abandon DataMapper, but developers from both projects kindly provided background for this blog post.

Project A

Web application and a batch processing component built on top of a legacy Oracle database.

Good

  • Field mappings, nice ruby names and able to ignore fields we didn’t care about.

Bad

  • Had to roll our own locking and time zone integration.
  • Not great for batch processing (trying to write SQL through DM abstraction.)

It turned out this project required a lot more batch processing than we anticipated, which DataMapper does not shine at. It was migrated to Sequel which provides a far better abstraction for working closer to SQL.

Project B

A fairly typical Rails 3 application. A couple of tens of thousands of lines of code.

Good

  • No migrations (pre-release).
  • Foreign keys, composite primary keys.
  • Auto-validations.

Bad

  • Auto-validations with nested attributes was uncharted territory (needed bug fixes).
  • Performance on large object graphs was unusable for page rendering (close to two seconds for our home page, which admittedly had a stupid amount of stuff on it).
  • Performance was suboptimal (though passable) on smaller pages.
  • Tracing through what is happening across multiple gems (particularly around transactions) was tricky.
  • The maintenance/interactions of all the various gems was problematic (e.g. gems X,Y work with 1.9.3 but Z doesn’t yet).
  • Inability to easily “break the abstraction” when SQL was required.

The performance issues were clear in our code base, but eluded much effort to reduce them down to smaller reproducible problems. The best quick win I found was ~15% by disabling assertions, but I suspect that given the large scope of the problem DataMapper is trying to solve there may not be any approachable way of tackling the issue (would love to be proven wrong!)

We ran into obvious integration bugs (apologies for not having kept a concrete list), a symptom of a library not widely used. As a committer on the project this wasn’t an issue, since they were easily fixed and moved past (the DataMapper code base is really nice to work on), but having a committer on your team isn’t a tenable strategy.

DataMapper takes an all-ruby-all-the-time approach, which means things get tricky when the abstraction leaks. Much of the SQL generation is hidden in private methods. Compare some code to create a composable full text search query:

def self.search(keywords, options = {})
  options = {
    conditions: ["true"]
  }.merge(options)

  current_query = query.merge(options)

  a           = repository.adapter
  columns_sql = a.send(:columns_statement,    current_query.fields,     false)
  conditions  = a.send(:conditions_statement, current_query.conditions, false)
  order_sql   = a.send(:order_statement,      current_query.order,      false)
  limit_sql   = current_query.limit || 50
  conditions_sql, conditions_values = *conditions

  bind_values = [keywords] + conditions_values

  find_by_sql([<<-SQL, *bind_values])
    SELECT #{columns_sql}, ts_rank_cd(search_vector, query) AS rank
    FROM things
    CROSS JOIN plainto_tsquery(?) query
    WHERE #{conditions_sql} AND (query @@ search_vector)
    ORDER BY rank DESC, #{order_sql}
    LIMIT #{limit_sql}
  SQL
end

To the ActiveRecord equivalent (Sequel is similar):

def self.search(keywords)
  select("things.*, ts_rank_cd(search_vector, query) AS rank")
    .joins(sanitize_sql_array(["CROSS JOIN plainto_tsquery(?) query", keywords]))
    .where("query @@ search_vector")
    .order("rank DESC")
end

Switching to ActiveRecord took a week of all hands (~4) on deck, plus another week alongside other feature work to get it stable. From start to production took two weeks. The end result was a drop in response time (the deploy is pretty blatant in the graph below) and start up time, plus 3K fewer lines of code (a lot of custom code for dropping down to SQL could be removed).

Do differently

Ultimately, DataMapper provides an abstraction that I just don’t need, and even if I did it hasn’t had its tires kicked sufficiently that a team can use it without having to delve down to the internals. The applications I find myself writing are about data, and the store in which that data lives is vitally important to the application. Abstracting away those details seems to be heading in the wrong direction for writing simple applications. As an intellectual achievement in its own right I really dig DataMapper, but it is too complicated a component to justify using inside other applications.

Rich Hickey’s talk Simple Made Easy has been rattling around my head a lot.

Nowadays I’m back to ActiveRecord for team conformance. It’s more work to keep on top of foreign keys and the like, but overall it does the job. It’s still too complicated, but has the non-trivial benefit of being used by lots of people. This is my responsible choice at the moment.

On my own projects I first reach for Sequel. It supports all the nice database features I want to use, while providing a thin layer over SQL. In other words, I don’t have to worry about the abstraction leaking because the abstraction is still SQL, just expressed in ruby (which is a huge win for composability that you don’t get with raw SQL). While it does have “ORM” features, it feels more like the most convenient way of accessing my database rather than an abstraction layer. It’s actively maintained and the only bug I have found was something that Rails broke, and a patch was already available. There are no open issues in the bug tracker. My experiences have been overwhelmingly positive. I haven’t built anything big enough with it yet to have confidence using it on a team project though.

I still have a soft spot in my heart for DataMapper, I just don’t see anywhere for me to use it anymore.

Exercises in style

Let us make a stack machine! It can add numbers! This may be a winding journey. Have some time and an irb up your sleeve. Maybe it is more of a meditation than a blog post? Onwards!

def push_op(value)
  lambda {|x| [value, x + [value]] }
end

def add_op
  lambda {|x| [x[-1] + x[-2], x[0..-3]] }
end

[
  push_op(1),
  push_op(2),
  add_op
].inject([nil, []]) {|(result, state), op|
  op[state]
}

Get it? Pushes 1, pushes 2, then the add_op pops them off the stack and makes 3. Not a lot of metadata in those lambdas though, and we can’t combine them in interesting ways.
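
If the inject is hard to follow, here is a small sketch that records the `[result, stack]` pair after every step. The `trace` helper is a hypothetical addition for illustration; the two operations are the same as before.

```ruby
def push_op(value)
  lambda { |stack| [value, stack + [value]] }
end

def add_op
  lambda { |stack| [stack[-1] + stack[-2], stack[0..-3]] }
end

# Run the operations in order, recording [result, stack] after each step.
def trace(ops)
  state = []
  ops.map do |op|
    result, state = op.call(state)
    [result, state]
  end
end

trace([push_op(1), push_op(2), add_op]).each { |step| p step }
# [1, [1]]
# [2, [1, 2]]
# [3, []]
```

Note that the add pops both numbers and reports 3 as the result without pushing it back on to the stack.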

class Operation < Struct.new(:block)
  def +(other)
    CompositeOperation.new(self, other)
  end

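  # Struct members are exposed through accessors, not instance variables,
  # so use block rather than @block.
  def run(state)
    block.call(state)
  end

  # The evaluators further down invoke operations with call.
  alias_method :call, :run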
end

class CompositeOperation < Operation
  def initialize(a, b)
    @a = a
    @b = b
    super(lambda {|x| @b.block[@a.block[x][1]] })
  end

  def desc
    @a.desc + "\n" + @b.desc
  end
end

class PushOperation < Operation
  def initialize(value)
    @value = value
    super(lambda {|x| [value, x + [value]] })
  end

  def desc
    "push #{@value}"
  end
end

class AddOperation < Operation
  def initialize
    super(lambda {|x| [x[-1] + x[-2], x[0..-3]] })
  end

  def desc
    "add top two digits on stack"
  end
end

A lot more setup, but now we also get a description of operations!

def tagged_push_op(value)
  PushOperation.new(value)
end

def tagged_add_op
  AddOperation.new
end

ops =
  tagged_push_op(1) +
  tagged_push_op(2) +
  tagged_add_op

puts ops.desc
puts ops.run([]).inspect # start from an empty stack

Ok you get that. What else can we do?

“every monad [.] embodies a particular computational strategy. A ‘motto of computation,’ if you will.” – Mental Guy

hmmm. What does it mean?

class VerboseStackEvaluator < Struct.new(:stack)
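  # Only :result needs an accessor; Struct already provides stack, and
  # redefining it here would shadow the Struct member with a nil ivar.
  attr_accessor :result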

  def pass(op)
    puts op.desc
    results = op.call(stack)
    self.class.new(results[1]).tap do |x|
      x.result = results[0]
    end
  end

  def self.identity
    new([])
  end
end

evaluator = VerboseStackEvaluator
e = evaluator.identity.
  pass(tagged_push_op(1)).
  pass(tagged_push_op(2)).
  pass(tagged_add_op)

p [e.result, e.stack]

Oh so now we have one structure (the pass stuff) that we can run through different evaluators. Let us make a recursive one!

class RecursiveLazyStackEvaluator < Struct.new(:stack)
  def pass(op)
    self.class.new(lambda {
      op.call(stack)
    })
  end

  def self.identity
    new(lambda { [nil, []] })
  end

  def result; evaled[0]; end
  def stack;  evaled[1]; end

  private

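  def evaled
    # Struct members are not instance variables, and the stack reader above
    # is overridden, so fetch the stored lambda with self[:stack].
    @evaled ||= self[:stack].call
  end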
end

Do you see? It is now lazy. Rather than evaluate each operation when pass is called, it saves them up until a result is requested. Look out! Haskell in your Ruby! Recursion might blow out our stack though. Let us isomorphically (I just learned this word) translate it to use iteration!

class LazyStackEvaluator
  attr_accessor :steps

  def initialize(stack, steps = [])
    @stack  = stack
    @steps  = steps
  end

  def pass(op)
    self.class.new(@stack, steps + [op])
  end

  def self.identity
    new([])
  end

  def result; evaled[0]; end
  def stack;  evaled[1]; end

  protected

  def evaled
    @evaled ||= steps.inject([nil, @stack]) {|(r, s), op|
      op.call(s)
    }
  end
end
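
To see the laziness in action, here is a sketch with an operation that logs when it actually runs. The evaluator is a condensed copy of LazyStackEvaluator so the example stands alone; `logged_push` is a hypothetical helper, and plain lambdas are used as operations since they already respond to `call`.

```ruby
# Condensed copy of LazyStackEvaluator so the sketch runs standalone.
class LazyStackEvaluator
  def initialize(stack, steps = [])
    @stack = stack
    @steps = steps
  end

  def pass(op)
    self.class.new(@stack, @steps + [op])
  end

  def self.identity
    new([])
  end

  def result; evaled[0]; end
  def stack;  evaled[1]; end

  private

  def evaled
    @evaled ||= @steps.inject([nil, @stack]) { |(_result, stack), op|
      op.call(stack)
    }
  end
end

# Hypothetical logging operation: records the moment it is executed.
def logged_push(value, log)
  lambda do |stack|
    log << "ran push #{value}"
    [value, stack + [value]]
  end
end

log = []
e = LazyStackEvaluator.identity.
  pass(logged_push(1, log)).
  pass(logged_push(2, log))

p log     # [] (nothing has run yet)
p e.stack # [1, 2] (asking for the stack forces evaluation)
p log     # ["ran push 1", "ran push 2"]
```

Because the evaluation is memoized, asking for `result` afterwards does not run the operations a second time.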

Not too shabby. Let’s try something more useful. Given we only have one operation that pops the stack (add), and it only pops two numbers, if we have more than two numbers in a row they start becoming redundant. Let us optimize!

class OptimizingEvaluator < LazyStackEvaluator
  def evaled
    @evaled ||= begin
      accumulator = []
      new_steps   = []
      steps.each do |step|
        accumulator << step
        if !step.is_a?(PushOperation)
          new_steps += accumulator
          accumulator = []
        elsif accumulator.length > 2
          accumulator = accumulator[1..-1]
        end
      end
      new_steps += accumulator
      new_steps.inject([nil, @stack]) {|(r, s), op|
        op.call(s)
      }
    end
  end
end

evaluator = OptimizingEvaluator
e = evaluator.identity.
  pass(tagged_push_op(1)). # This won't get run!
  pass(tagged_push_op(1)).
  pass(tagged_push_op(2)).
  pass(tagged_add_op)

p [e.result, e.stack]
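
The pruning rule on its own, as a sketch: steps are represented as plain arrays like `[:push, 1]` and `[:add]` rather than Operation objects, and `prune` is a hypothetical name. The logic mirrors the loop inside evaled.

```ruby
# Keep at most the last two pushes in front of each non-push step.
def prune(steps)
  pending = []
  kept    = []
  steps.each do |step|
    pending << step
    if step.first != :push
      # A non-push step flushes whatever pushes survived so far.
      kept += pending
      pending = []
    elsif pending.length > 2
      # A third consecutive push makes the oldest one redundant.
      pending = pending[1..-1]
    end
  end
  kept + pending
end

p prune([[:push, 1], [:push, 1], [:push, 2], [:add]])
# [[:push, 1], [:push, 2], [:add]]
```

One thing to keep in mind: the result is unchanged, but the final stack is not. Without pruning, the example leaves the first 1 on the stack; with pruning, the stack ends up empty.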

Ok one more. This one is pretty useless for this problem, but perhaps it will inspire thought. Let us multithread!

class ThreadingEvaluator < LazyStackEvaluator
  def evaled
    @evaled ||= begin
      accumulator = []
      workers     = []
      steps.each do |step|
        accumulator << step
        if step.is_a?(AddOperation)
          workers << spawn_thread(accumulator)
          accumulator = []
        end
      end
      workers << spawn_thread(accumulator) unless accumulator.empty?
      workers.each(&:join)

      workers.last[:result]
    end
  end

  def spawn_thread(accumulator)
    Thread.new do
      sleep rand / 3
      Thread.current[:result] = begin
        e = accumulator.inject(VerboseStackEvaluator.identity) {|e, s| e.pass(s) }
        [e.result, e.stack]
      end
    end
  end
end

evaluator = ThreadingEvaluator
e = evaluator.identity.
  pass(tagged_push_op(1)).
  pass(tagged_push_op(1)).
  pass(tagged_push_op(2)).
  pass(tagged_add_op).
  pass(tagged_push_op(3)).
  pass(tagged_push_op(4)).
  pass(tagged_add_op)

p [e.result, e.stack]

Ok that is all. Here is an exercise for you: how would you allow the threading and optimizing evaluators to be combined?

Interface Mocking

UPDATE: This is a gem now: rspec-fire. The code in the gem is better than that presented here.

Here is a screencast I put together in response to a recent Destroy All Software screencast on test isolation and refactoring, showing off an idea I’ve been tinkering around with for automatic validation of your implicit interfaces that you stub in tests.

Interface Mocking screencast.

Here is the code for InterfaceMocking:

module InterfaceMocking

  # Returns a new interface double. This is equivalent to an RSpec double,
  # stub, or mock, except that if the class passed as the first parameter
  # is loaded it will raise if you try to set an expectation or stub on
  # a method that the class has not implemented.
  def interface_double(stubbed_class, methods = {})
    InterfaceDouble.new(stubbed_class, methods)
  end

  module InterfaceDoubleMethods

    include RSpec::Matchers

    def should_receive(method_name)
      ensure_implemented(method_name)
      super
    end

    def should_not_receive(method_name)
      ensure_implemented(method_name)
      super
    end

    def stub!(method_name)
      ensure_implemented(method_name)
      super
    end

    def ensure_implemented(*method_names)
      if recursive_const_defined?(Object, @__stubbed_class__)
        recursive_const_get(Object, @__stubbed_class__).
          should implement(method_names, @__checked_methods__)
      end
    end

    def recursive_const_get object, name
      name.split('::').inject(Object) {|klass,name| klass.const_get name }
    end

    def recursive_const_defined? object, name
      !!name.split('::').inject(Object) {|klass,name|
        if klass && klass.const_defined?(name)
          klass.const_get name
        end
      }
    end

  end

  class InterfaceDouble < RSpec::Mocks::Mock

    include InterfaceDoubleMethods

    def initialize(stubbed_class, *args)
      args << {} unless Hash === args.last

      @__stubbed_class__ = stubbed_class
      @__checked_methods__ = :public_instance_methods
      ensure_implemented *args.last.keys

      # __declared_as copied from rspec/mocks definition of `double`
      args.last[:__declared_as] = 'InterfaceDouble'
      super(stubbed_class, *args)
    end

  end
end

RSpec::Matchers.define :implement do |expected_methods, checked_methods|
  match do |stubbed_class|
    unimplemented_methods(
      stubbed_class,
      expected_methods,
      checked_methods
    ).empty?
  end

  def unimplemented_methods(stubbed_class, expected_methods, checked_methods)
    implemented_methods = stubbed_class.send(checked_methods)
    unimplemented_methods = expected_methods - implemented_methods
  end

  failure_message_for_should do |stubbed_class|
    "%s does not publicly implement:\n%s" % [
      stubbed_class,
      unimplemented_methods(
        stubbed_class,
        expected_methods,
        checked_methods
      ).sort.map {|x|
        "  #{x}"
      }.join("\n")
    ]
  end
end

RSpec.configure do |config|

  config.include InterfaceMocking

end
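
Stripped of the RSpec plumbing, the check at the heart of the implement matcher is a set difference against the class’s public instance methods. A plain Ruby sketch, where `EmailNotifier` is a hypothetical collaborator made up for illustration:

```ruby
# Hypothetical collaborator class.
class EmailNotifier
  def notify(user); end

  private

  def build_message; end
end

# Names in +expected+ that +klass+ does not publicly implement.
def unimplemented_methods(klass, expected)
  expected - klass.public_instance_methods
end

p unimplemented_methods(EmailNotifier, [:notify])
# []
p unimplemented_methods(EmailNotifier, [:notify, :build_message, :deliver])
# [:build_message, :deliver]
```

Private methods count as unimplemented, which is the point: a stub should only be allowed to stand in for the public interface.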

Static Asset Caching on Heroku Cedar Stack

UPDATE: This is now documented at Heroku (thanks Nick)

I recently moved this blog over to Heroku, and in the process added in some proper HTTP caching headers. The dynamic pages use the built-in fresh_when and stale? Rails helpers, combined with Rack::Cache and the free memcached plugin available on Heroku. That was all pretty straightforward; what was more difficult was configuring Heroku to serve all static assets (such as images and stylesheets) with a far-future max-age header so that they will be cached for eternity. What I’ve documented here is somewhat of a hack, and hopefully Heroku will provide a better way of doing this in the future.

By default Heroku serves everything in public directly via nginx. This is a problem for us since we don’t get a chance to configure the caching headers. Instead, use the Rack::StaticCache middleware (provided in the rack-contrib gem) to serve static files, which by default adds far-future max-age cache control headers. The files need to be served out of a different directory to public, since there is no way to disable the nginx serving. I renamed my public folder to public_cached.

# config/application.rb
config.middleware.use Rack::StaticCache, 
  urls: %w(
    /stylesheets
    /images
    /javascripts
    /robots.txt
    /favicon.ico
  ),
  root: "public_cached"

I also disabled the built in Rails serving of static assets in development mode, so that it didn’t interfere:

# config/environments/development.rb
config.serve_static_assets = false

In the production config, I configured the x_sendfile_header option to be “X-Accel-Redirect”. It was “X-Sendfile”, which is an Apache directive, and was causing nginx to hang (Heroku would never actually serve the assets to the browser).

# config/environments/production.rb
config.action_dispatch.x_sendfile_header = 'X-Accel-Redirect'

A downside of this approach is that if you have a lot of static assets, they all have to hit the Rails stack in order to be served. If you only have one dyno (the free plan) then the initial load can be slower than it otherwise would be if nginx was serving them directly. As I mentioned in the introduction, hopefully Heroku will provide a nicer way to do this in the future.
