Robot Has No Heart

Xavier Shay blogs here

A robot that does not have a heart

Selenium, webrat and the firefox beta

I needed a few hacks to get selenium running with webrat.

First, make sure you are running at least 0.4.4 of webrat. Don’t make the same mistake I did and upgrade your gem version, but not the plugin installed in vendor/plugins.

gem install webrat
gem install selenium-client
gem install bmabey-database_cleaner --source=http://gems.github.com

There is a trick to get Firefox 3.5 beta working. The selenium server package with webrat 0.4.4 only supports FF 3.0.*. Follow these instructions, patching the jar that is packaged with webrat (vendor/selenium-server.jar) so that the extensions that selenium uses will be valid for the new FF.

cd vendor/ # In webrat dir
jar xf selenium-server.jar \
customProfileDirCUSTFFCHROME/extensions/readystate@openqa.org/install.rdf
jar xf selenium-server.jar \
customProfileDirCUSTFFCHROME/extensions/{538F0036-F358-4f84-A764-89FB437166B4}/install.rdf
jar xf selenium-server.jar \
customProfileDirCUSTFFCHROME/extensions/\{503A0CD4-EDC8-489b-853B-19E0BAA8F0A4\}/install.rdf 
jar xf selenium-server.jar \
customProfileDirCUSTFF/extensions/readystate\@openqa.org/install.rdf 
jar xf selenium-server.jar \
customProfileDirCUSTFF/extensions/\{538F0036-F358-4f84-A764-89FB437166B4\}/install.rdf

replace "3.0.*" "3.*" -- `find . | grep rdf`

jar uf selenium-server.jar \
customProfileDirCUSTFFCHROME/extensions/readystate@openqa.org/install.rdf
jar uf selenium-server.jar \
customProfileDirCUSTFFCHROME/extensions/{538F0036-F358-4f84-A764-89FB437166B4}/install.rdf
jar uf selenium-server.jar \
customProfileDirCUSTFFCHROME/extensions/\{503A0CD4-EDC8-489b-853B-19E0BAA8F0A4\}/install.rdf 
jar uf selenium-server.jar \
customProfileDirCUSTFF/extensions/readystate\@openqa.org/install.rdf 
jar uf selenium-server.jar \
customProfileDirCUSTFF/extensions/\{538F0036-F358-4f84-A764-89FB437166B4\}/install.rdf

(hat tip to space vatican)
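If the `replace` utility (I believe it ships with MySQL) isn't on your machine, the substitution step can be sketched in plain Ruby. This is a minimal sketch, not part of webrat or selenium: `widen_firefox_versions` is a made-up name, and it assumes the extracted install.rdf files sit somewhere under the directory you pass in.

```ruby
require 'find'

# Rewrites the Firefox version constraint in every install.rdf under root.
# Returns the paths of the files that were actually changed.
# (Hypothetical helper; the name and defaults are assumptions.)
def widen_firefox_versions(root, from = "3.0.*", to = "3.*")
  changed = []
  Find.find(root) do |path|
    next unless File.file?(path) && File.basename(path) == "install.rdf"
    original = File.read(path)
    patched  = original.gsub(from, to) # a String pattern is matched literally
    next if patched == original
    File.open(path, "w") { |f| f.write(patched) }
    changed << path
  end
  changed
end

# Hypothetical usage, run from vendor/ between the `jar xf` and `jar uf` steps:
#   widen_firefox_versions(".")
```

Because the pattern is a plain string rather than a regex, the dots and the star are not treated as wildcards, and running it twice changes nothing the second time.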

I haven’t been able to get Safari working yet.

I want to run selenium tests alongside normal webrat tests, so I created a new environment “acceptance” that I can run tests under. Modify your test helper file:

# test/test_helper.rb
ENV["RAILS_ENV"] ||= "test"
raise "Can't run tests in #{ENV['RAILS_ENV']} environment" unless %w(test acceptance).include?(ENV["RAILS_ENV"])

require 'webrat'
require "test/env/#{ENV["RAILS_ENV"]}"

# ...

# test/env/test.rb
require 'webrat/rails'

Webrat.configure do |config|
  config.mode = :rails
  config.open_error_files = false
end

# test/env/acceptance.rb
require 'webrat/selenium/silence_stream'
require 'webrat/selenium'
require 'test/selenium_helpers'
require 'test/element_helpers'

# Required because we aren't isolating tests inside a transaction
require 'database_cleaner'
DatabaseCleaner.strategy = :truncation

Webrat.configure do |config|
  config.mode = :selenium
end

class ActionController::IntegrationTest
  self.use_transactional_fixtures = false # Necessary, otherwise selenium will never see any changes!

  setup do |session|
    session.host! "localhost:3001"
  end

  teardown do
    DatabaseCleaner.clean
  end
end

# Hack: webrat requires this, even though we're not using rspec
module Spec
  module Expectations
    class ExpectationNotMetError < Exception
    end
  end
end

# lib/tasks/test.rake
namespace :test do
  task :force_acceptance do
    ENV["RAILS_ENV"] = 'acceptance'
  end

  Rake::TestTask.new(:acceptance => :force_acceptance) do |t|
    t.test_files = FileList['test/acceptance/*_test.rb']
    t.verbose = true
  end
end

Notes

  • selenium and javascript helpers are from pivotallabs pat; they’re really handy for testing visibility of DOM elements
  • there’s some magic in webrat to conditionally require silence_stream based on something in active_support. I don’t understand it quite enough, but requiring it explicitly was necessary to get things running for me
  • webrat/selenium assumes some classes are loaded, which only happens if you’re using rspec. I’m not, so I stubbed out the ExpectationNotMetError (it is only referred to in a rescue block).
  • rake test:acceptance runs the selenium tests. Running acceptance tests directly as a ruby script runs them using normal webrat – this is actually handy when writing tests because you get a quicker turnaround.
  • to pause selenium mid test run (to see wtf is going on), just add gets at the appropriate line in your test

Faster rails testing with ruby_fork

A long running test suite isn’t the problem. Your build server can take care of that. A second or two here or there, no one notices.

The killer wait is in the red/green/refactor loop. You’re only running one or two tests, and an extra second can mean the difference between getting into flow or switching to twitter. And you know what kills you in rails?

$ time ruby -e '' -r config/environment.rb

real    0m3.784s
user    0m2.707s
sys     0m0.687s

Yep, the environment. That’s a lot of overhead to be waiting for every time you run a test, especially since it’s the same code every time! You fix this with a clever script called ruby_fork that’s included in the ZenTest package. It loads up your environment, then just chills out, waiting. You send a ruby file to it, and it forks itself (the process containing the environment) to execute that file. The beauty of this is that forking is really quick, and it leaves a pristine copy of the environment around for the next test run.
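The preload-then-fork idea can be sketched in a few lines. This is a toy illustration of the technique, not ruby_fork's actual code: `$preloaded` stands in for the expensive environment load, `run_in_fork` is a made-up name, and it assumes a platform where `fork` is available.

```ruby
# Stand-in for the slow part. In real life this would be something like
# `require 'config/environment'` or `require 'test/test_helper'`.
$preloaded = { :name => "environment", :loaded_at => Time.now }

# Forks the already-warm process, runs one file in the child, and returns
# the child's exit status. In-memory changes made by the file stay in the
# child, so the parent's preloaded state is untouched for the next run.
def run_in_fork(path)
  pid = fork do
    begin
      load path
      exit!(0)
    rescue Exception => e
      $stderr.puts "#{e.class}: #{e.message}"
      exit!(1)
    end
  end
  Process.wait(pid)
  $?.exitstatus
end

# Hypothetical usage:
#   run_in_fork("test/unit/your_test.rb")
```

One caveat with this sketch: `exit!` skips `at_exit` hooks, so a test framework that kicks off its run from such a hook would need different handling in the child.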

‘Environment’ doesn’t just have to be environment.rb; for bonus points you can load up test_helper.rb, which will also load your testing framework into memory. In fact, you can preload any ruby code at all – ruby_fork isn’t rails specific.

$ ruby_fork -r test/test_helper.rb &
/opt/local/bin/ruby_fork Running as PID 526 on 9084

$ time ruby_fork_client -r test/unit/your_test.rb
Started
...
Finished in 0.565636 seconds. # Aside: this time is bollocks

3 tests, 4 assertions, 0 failures, 0 errors

real    0m0.972s # This is the time you're interested in
user    0m0.225s
sys     0m0.035s

That’s fantastic, though you’ll notice in newer versions of rails your application code is not reloaded. By default your test environment caches classes – which normally isn’t a problem except that newer rails versions also eager load those classes (so they’re loaded when you load environment.rb). You can fix this by clearing out the eager load paths in your test environment file:

# config/environments/test.rb
config.eager_load_paths = []

On my machine this gets individual test runs down from about 4 seconds to less than 1 second. You can sell that to your boss as a four-fold productivity increase.

Testing Glue Code

db2s3 combines together 3 external dependencies – your database, the filesystem, and Amazon’s S3 service. It has 1 conditional in the main code path (and it’s not even an important one). The classic unit testing approach of “stub everything” provides little benefit.

Unit testing is good for ensuring complex code paths execute properly, that edge cases are properly explored, and for answering the question “what broke?”. For trivial glue code, none of these are of particular benefit. There are no complex code paths or edge cases, and it will be quickly obvious what broke. In fact, the most likely thing to “break” (or change) over time isn’t your code, but the external services it is sticking together, which stubs cannot protect you from. Considering the high relative cost of stubbing out all your dependencies, unit testing becomes an expensive way of testing something quite simple.

For glue code, integration tests are the best solution. Glue code needs to stick, and integration tests ensure that it does. Here is the only test that matters from db2s3:

it 'can save and restore a backup to S3' do
  db2s3 = DB2S3.new
  load_schema
  Person.create!(:name => "Baxter")
  db2s3.full_backup
  drop_schema
  db2s3.restore
  Person.find_by_name("Baxter").should_not be_nil
end

This test costs money to run since it hits the live S3 service, but only in the academic sense. The question you need to ask is “would I pay one cent to have confidence my backup solution works?”

Always remember why you are testing. Unit tests are a focussed tool, and not always necessary.

Backup MySQL to S3 with Rails

Here is some code I wrote over the weekend – db2s3. It’s a rails plugin that provides rake tasks for backing up your database and storing it on Amazon’s S3 cloud storage. S3 is a trivially cheap offsite backup solution – for small databases it costs about 4 cents per month, even if you’re sending full backups every hour.

There are many scripts around that do this already, but they fail to address the biggest actual problem. The aws-s3 gem provides a really nice ruby interface to S3, and dumping a backup then storing it really isn’t that hard. The real problem is that I really hate system administration. I want to spend as little time as possible and I want things to Just Work.

A script is great but there’s still too many things for me to do. Where does it go in my project? How do I set my credentials? How do I call it?

That’s why a plugin was needed. It’s as little work as possible for a rails developer to backup their database, so they can get back to making their app awesome.

db2s3. Check it out.

Singleton resource, pluralized controller fix for rails

map.resource still looks for a pluralized controller. This has always bugged me. Here’s a quick monkey patch to fix. Tested on rails 2.2.2.

# config/initializers/singleton_resource_fix.rb
module ActionController
  module Resources
    class SingletonResource < Resource #:nodoc:
      def initialize(entity, options)
        @singular = @plural = entity
        # options[:controller] ||= @singular.to_s.pluralize
        options[:controller] ||= @singular.to_s # This is the only line to change
        super
      end
    end
  end
end

# config/routes.rb
# before fix
map.resource :session, :controller => 'sessions'

# after fix
map.resource :session

Evolution of a graph

Recently I have wanted to chart some cost data I collected on various foods. As a baseline for discussion, here is a very vanilla excel type graph, reminiscent of ones I am certain you have seen in powerpoint presentations:

This is not a good graph for several reasons:

  • Only provides a general overview of the data – some foods are cheaper, some more expensive, so what?
  • Labels feel cramped and ugly.
  • The grid is too prominent and distracting, without being very helpful – you can’t read accurate values from it.

The biggest problem is that it doesn’t “invite the eye to compare”. It doesn’t leave an impact. The first step to addressing this is to revisit the data – it’s quite possible you just have boring data. In this case, I improved the data by coding it according to whether it is vegetarian or not.

Version 2

For the next iteration of this graph, I colored the graph to highlight the vegetarian aspect of the food. To address the other issues, I moved the labels into the legend, and completely removed the grid, instead displaying the values directly on the graph. This technique works due to the low number of data points. You can think of it as “enhancing” the table rather than displaying a high level overview of it. Also, a serif font (georgia) was used.

This is certainly an improvement, but it still has its flaws:

  • 8 different colors, which distracts from the data, and the vegetarian data is muted.
  • It is much harder to identify the food with the data point, now that the labels have been moved into the legend.

Final

I iterated again, moving the labels back down to the x-axis, which in addition to solving the identification problem, allowed me to drop back down to 2 colours. In our initial graph this felt cramped, so I added some more whitespace and also kept the serif font from the last iteration.

This version of the graph speaks much louder. It’s easier on the eye, and the conclusion I want to draw from the data is clearly expressed. I am using this graph (with proper references and notes) on a new information site I’m working on – it’s far from complete but you can follow along on github if you’re interested.

Tools

The first graph was made with OpenOffice spreadsheet, the second with a hacked version of flot for jQuery. The final graph was made with a new jQuery plugin I wrote called tufte-graph. There is a meta-lesson here – I spent hours hacking different JS libraries to try and get them working exactly how I wanted, in the end the quickest solution was to just write what I needed.

I use Colour Lovers to find nice colour palettes. Works much better than trying random RGB codes.

Final word

Spend time on your graphs. A picture is worth a thousand words. They are too often neglected, and it doesn’t take much effort to make them really shine.

inject and collect with jQuery

You know, I would have thought someone had already made an enumerable plugin for jQuery. Maybe someone has. Mine is better.

  • Complete coverage with screw-unit
  • Interface so consistent with jQuery you’ll think it was core

squares = $([1,2,3]).collect(function () {
  return this * this;
});
squares // => [1, 4, 9]

It’s on github. It deliberately doesn’t have the kitchen sink – fork and add methods you need, there’s enough code it should be obvious the correct way to do it.

As an aside, it’s really hard to spec these methods concisely. I consulted the rubyspec project and it turns out they had trouble as well, check out this all encompassing spec for inject: “Enumerable#inject: inject with argument takes a block with an accumulator (with argument as initial value) and the current element. Value of block becomes new accumulator”. Bit of a mouthful eh.
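To unpack that mouthful, here is what the spec sentence describes, using plain Ruby's own `collect` and `inject` (not the jQuery plugin). The `steps` array is only there to record what the block receives on each call.

```ruby
# collect: the Ruby counterpart of the squares example.
squares = [1, 2, 3].collect { |n| n * n }
squares # => [1, 4, 9]

# inject with an argument: the argument seeds the accumulator, and the
# value of the block becomes the accumulator for the next element.
steps = []
sum = [1, 2, 3].inject(10) do |accumulator, element|
  steps << [accumulator, element]
  accumulator + element
end

sum   # => 16
steps # => [[10, 1], [11, 2], [13, 3]]
```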

Post your improvements in the comments.

Code for Christmas

Developers don’t have enough time.

We’re all too busy working our day job, or looking after our better half, to give our pet projects the attention they deserve.

That makes time the most valuable thing we can give. This year for Christmas, why not give a fellow developer some?

Ticking off an amazon wishlist never really resonated with me, so this year here is what we are all doing instead:

  1. Find someone’s pet open source project – I’d start at github
  2. Contribute! It doesn’t have to be much – a spec or two, some documentation, or even just a “hey it works on my box”. Fork, commit, pull request.
  3. Wish them a Merry Christmas!

That shouldn’t take you more than an hour. It’s a total win all around – you get to hone your chops, they get some love on their project, and the open source ecosphere is improved. If you’re feeling generous, or don’t have any friends, there’s no shortage of projects that I’m sure would welcome some support.

My wishlist is any of the ruby midi projects out there.

Unique data in dm-sweatshop

dm-sweatshop is how you set up test data for your datamapper apps. Standard practice is to generate random data that follows a pattern:

User.fix {{
  :login  => /\w+/.gen
}}

new_user = User.gen

Let’s not now debate whether or not random data in tests is a good idea. What’s more important is that the above code should make you uneasy if login is supposed to be unique. There was a hack in sweatshop that would try recreating the data if you had a uniqueness constraint on login and it was invalid, but it was exactly that: a hack. As of a few days ago (what will be 0.9.7), you need to be more explicit if you want unique data. It’s pretty easy:

include DataMapper::Sweatshop::Unique

User.fix {{
  :login  => unique { /\w+/.gen }
}}

Tada! You can also easily get non-random unique data by providing a block with one parameter. Check the README for this and other cool things you can do.
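For a feel of what being explicit about uniqueness involves, here is a minimal sketch of a retry-until-unseen helper. It is not dm-sweatshop's implementation: `unique_value`, the `seen` hash and the `max_tries` limit are all assumptions made for illustration.

```ruby
require 'set'

# Calls the block until it returns a value not yet seen under `key`,
# remembers that value, and returns it. Gives up after max_tries calls.
# (Hypothetical helper, not part of dm-sweatshop.)
def unique_value(seen, key, max_tries = 10)
  seen[key] ||= Set.new
  max_tries.times do
    value = yield
    return value if seen[key].add?(value) # add? returns nil for duplicates
  end
  raise "could not generate a unique value for #{key.inspect} in #{max_tries} tries"
end

seen       = {}
candidates = %w(bob bob alice)
first  = unique_value(seen, :login) { candidates.shift } # => "bob"
second = unique_value(seen, :login) { candidates.shift } # => "alice"
```

The second call skips the repeated "bob" and returns "alice". A generator that cannot produce anything new fails loudly instead of looping forever.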

Introducing SocialBeat (screencast)

Here is a screencast of socialbeat in which you will note:

  1. I don’t appear drunk
  2. I don’t reveal intra-company communications
  3. I show off the full gamut of socialbeat’s awesomeness in under 3 minutes

In these ways you may find it superior to other screencasts you may have seen on the matter.


Introducing SocialBeat

If you are behind the times – socialbeat is some code that lets you live code OpenGL visualizations to MIDI tracks.

Comparing lambdas in ruby

to_ruby is a really convenient way to compare the equality of two lambdas. It’s a bit slow though. If we get our hands dirty (only a little!) with ParseTree, we can get a result 2 orders of magnitude quicker. I’d be interested to see if these benchmarks differ significantly on other versions of ruby.

~ $ ruby -v
ruby 1.8.6 (2007-09-23 patchlevel 110) [i686-darwin8.11.1]

require 'benchmark'
require 'parse_tree'
require 'ruby2ruby'

def gen_lambda
  lambda {|x| x + 1 }
end

Parser = ParseTree.new(false)

# This only requires parse tree, not ruby2ruby
def proc_identity(block)
  klass = Class.new
  name = "myproc"
  klass.send(:define_method, name, &block)

  # .last ignores the method name and definition - they're irrelevant
  Parser.parse_tree_for_method(klass, name).last 
end

n = 1000
Benchmark.bmbm do |x|
  x.report("#to_ruby") { n.times { gen_lambda.to_ruby == gen_lambda.to_ruby }}
  x.report("#to_sexp") { n.times { gen_lambda.to_sexp == gen_lambda.to_sexp }}
  x.report("manual")   { n.times { proc_identity(gen_lambda) == proc_identity(gen_lambda) }}
end

               user     system      total        real
#to_ruby   4.460000   0.220000   4.680000 (  4.695327)
#to_sexp   0.920000   0.190000   1.110000 (  1.110214)
manual     0.030000   0.000000   0.030000 (  0.032768)

In case you were wondering, I was playing around with this while implementing unique data generation for dm-sweatshop

Integration testing with Cucumber, RSpec and Thinking Sphinx

Ideally you would want to include sphinx in your integration tests. It’s really just like your database. In practice, this is problematic. Ensuring the DB is started and triggering a re-index after each model load is doable, if slow, with a small bit of hacking of thinking sphinx (hint – change the initializer for the ThinkingSphinx::Configuration to allow you to specify the environment). Here’s the rub though – if you’re using transactional fixtures the sphinx indexer won’t be able to see any of your data! Turning that off can really slow down your tests, and once you add in the re-indexing time you’re going to be making a few cups of coffee while they run.

One approach I’ve been taking is to stub out the search methods with RR. I know, I know, stubbing in your integration tests is evil. I’m being pragmatic here. For most applications your search is trivial (find me results for this keyword), and if you unit test your define_index block you’re pretty well covered. To go one step further you could unit test your controllers with an expect on the search method, or have a separate suite of non-transactional integration tests running against sphinx. I like the latter, but haven’t done it yet.

Enough talk! Here’s the magic you need to get it working with cucumber:

# features/steps/env.rb
require 'rr'
Cucumber::Rails::World.send(:include, RR::Adapters::RRMethods)

# features/steps/*_steps.rb
Given /a car with model '(\w+)' exists/ do |model|
  car = Car.create!(:model => model)
  stub(Car).search(model) { [car] }
end

Capturing output from rake

Rake has an annoying habit of putting its own diagnostic line on the first line of output. You can strip that out with tail.

rake my_report:xml | tail -n+2 > output.xml
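If you are capturing the output from inside a Ruby script rather than a shell pipeline, the same trick is a one-liner. A minimal sketch, assuming the diagnostic is always exactly the first line; `strip_first_line` is a made-up name.

```ruby
# Drops the first line of captured output (assumed to be rake's own
# diagnostic line) and returns the rest unchanged.
def strip_first_line(output)
  output.lines.drop(1).join
end

# Hypothetical usage:
#   xml = strip_first_line(`rake my_report:xml`)
#   File.open("output.xml", "w") { |f| f.write(xml) }
```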

You don't need view logic in models

Jake Scruggs wrote about moving view logic into his models

It’s hard to tell without knowing the full dataset, but my approach to these sort of problems is to reduce the data down to the simplest possible form (usually a hash), and then use an algorithm to extract what I need.

One commenter tried this and I think it’s heading in the right direction. There is potentially quite a lot of duplication here – the repetition of the layouts and scripts. To ease this it can sometimes be easier to invert the key/values, for a more concise representation. You could reduce this even further if there were sensible defaults (if 90% of cars used a two_column layout, for instance) – just replace the raise in the following code.

# See original post for context
# Data
layouts = {
  'two_column'   => [Toyota, Saturn],
  'three_column' => [Hyundai],
  'ford'         => [Ford]
}

scripts = {
  'discount' => [Hyundai, Ford],
  'poll'     => [Saturn]
}

# Algorithm
find_key = lambda {|hash, car| 
  (
    hash.detect {|key, types| 
      types.any? {|type| car.is_a?(type)}
      # types.include?(car.class) if you're not using inheritance
    } || raise("No entry for car: #{car}")
  ).first
}

layout = find_key[layouts, @car]
script = find_key[scripts, @car]

@stylesheets += ['layout', 'theme'].collect {|suffix| "#{layout}_#{suffix}.css" }
@scripts     += ["#{script}.js"]

render :action => find_view, :layout => layout

This is preferable to putting this data in your object hierarchy for all the normal reasons, especially since it keeps view logic where you expect to find it and doesn’t muddy up your models.
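The "sensible defaults" variation can be sketched as a self-contained script. The empty car classes below are stand-ins so the snippet runs on its own, and `find_key_with_default` is a made-up name; the only real change from the lookup above is returning a default instead of raising.

```ruby
# Stand-in classes so the example is self-contained.
class Toyota; end
class Saturn; end
class Hyundai; end
class Ford; end

# With a default in place, only the exceptions need listing.
layouts = {
  'three_column' => [Hyundai],
  'ford'         => [Ford]
}

# Returns the first key whose list contains a type the car is an instance
# of, or the default when nothing matches.
def find_key_with_default(hash, car, default)
  match = hash.detect { |key, types| types.any? { |type| car.is_a?(type) } }
  match ? match.first : default
end

find_key_with_default(layouts, Toyota.new, 'two_column') # => "two_column"
find_key_with_default(layouts, Ford.new, 'two_column')   # => "ford"
```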

Speeding up Rails Initialization

Chad Wooley just posted a tip to get rails starting up faster. Which is real, except it doesn’t work if you’re using ActiveScaffold. This is due to a load ordering problem – ActiveScaffold monkey patches the Resource class used by routes after routes have been parsed the first time, and relies on the re-parsing triggered by the inflections change.

To fix this, you can explicitly require the monkey patch just before you draw your routes (it doesn’t depend on anything else in ActiveScaffold).

# config/routes.rb
ActionController::Routing::Routes.draw do |map|
  # Explicitly require this, otherwise it won't get loaded before our resources are parsed the first time
  require 'vendor/plugins/active_scaffold/lib/extensions/resources.rb'

  # Your routes go here...
end

Yes it’s a hack on top of hack, but I get my console 30% quicker, so I’m running with it.

Tested on 2.0.2
