Ruby debugging with puts, tap and Hirb
I use puts heaps when debugging. Combined with tap, it’s pretty handy. You can jump right in the middle of a method chain without having to move things around into variables.
x = long.chain.of.methods.tap {|x| puts x }.to.do.something.with
I thought, hey, why don’t I merge the two? And for bonus points, add in Hirb’s table display to format my models nicely. These are fairly personal customizations, and aren’t specific to a project, so I put them in my own ~/.railsrc file rather than in each project.
# config/initializers/developer_specific_customizations.rb
if %w(development test).include?(Rails.env)
  railsrc = "#{ENV['HOME']}/.railsrc"
  load(railsrc) if File.exist?(railsrc)
end

# ~/.railsrc
require 'hirb'

Hirb.enable :pager => false

class Object
  def tapp(prefix = nil, &block)
    block ||= lambda {|x| x }

    tap do |x|
      value = block[x]
      # Use Hirb's formatted view if it has one, otherwise fall back to inspect
      value = Hirb::View.formatter.format_output(value) || value.inspect

      if prefix
        print prefix
        if value.lines.count > 1
          print ":\n"
        else
          print ": "
        end
      end
      puts value
    end
  end
end

# Usage (in your spec files, perhaps?)
"hello".tapp           # => hello
"hello".tapp('a')      # => a: "hello"
"hello".tapp(&:length) # => 5
MyModel.first.tapp # =>
# +----+-------------------------+
# | id | created_at              |
# +----+-------------------------+
# | 7  | 2009-12-29 00:15:56 UTC |
# +----+-------------------------+
# 1 row in set
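If Hirb isn't around, the same trick works with nothing but the standard library. What follows is a minimal sketch and not the ~/.railsrc version: the `PlainTapp` module name and the optional `out` argument are assumptions, added only so the output can be captured instead of going to the terminal.

```ruby
require 'stringio'

# Hypothetical, Hirb-free variant of tapp (illustration only).
module PlainTapp
  # Prints the receiver (or the block's result) and returns the receiver,
  # so it can sit in the middle of a method chain just like tap.
  def tapp(prefix = nil, out = $stdout)
    value = block_given? ? yield(self) : self
    text  = value.inspect
    out.puts(prefix ? "#{prefix}: #{text}" : text)
    self
  end
end

class Object
  include PlainTapp
end

# The chain carries on with the original array, not with the printed text.
log = StringIO.new
sorted = [3, 1, 2].tapp('before sort', log).sort
```

The important design point is the trailing `self`: whatever gets printed, the chain continues with the original object.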
Full stack testing rack applications
Herein is described a method for full stack testing CloudKit apps. The same techniques could easily be applied to any other rack web application or framework, which is pretty much all the ruby ones these days (rails, sinatra, pancake, etc…). This method is ideal for non-HTML services. For HTML you’re probably better off just using webrat/selenium.
There are two external services that make up our stack:
- CloudKit application
- OpenID server
Both of these are rack applications, so we can start them up using the same method in our spec helper.
require 'spec'
require 'pathname'
require Pathname(__FILE__).dirname + 'support/application_server'
require Pathname(__FILE__).dirname + 'support/tcp_socket'

TEST_PORTS = {
  :app => 9293,
  :openid => 9294
}

$servers = nil

Spec::Runner.configure do |config|
  config.before(:all) do
    $servers ||= Support::ApplicationServer.multi_boot(
      {
        :config => File.expand_path(Dir.pwd + '/config.ru'),
        :port => TEST_PORTS[:app],
        :daemonize => true
      },
      {
        :config => File.expand_path(Dir.pwd + '/spec/support/rack_my_id.rb'),
        :port => TEST_PORTS[:openid],
        :daemonize => true
      }
    )
  end
end
A global variable is required here, since before(:all) in rspec runs once per describe block, rather than once per test run. An at_exit hook is used to shut down the services after the test run.
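The at_exit part isn't shown in the helper, so here is a rough sketch of the idea. It is not the Support::ApplicationServer code: `SERVER_PIDS`, `boot_server` and `shutdown_servers` are hypothetical names, and it assumes the servers run as plain child processes rather than daemonized ones.

```ruby
require 'rbconfig'

# Hypothetical registry of child processes started for the test run.
SERVER_PIDS = []

# Start a child process and remember its PID.
def boot_server(*command)
  pid = Process.spawn(*command)
  SERVER_PIDS << pid
  pid
end

# Terminate every remembered child; safe to call more than once.
def shutdown_servers
  SERVER_PIDS.each do |pid|
    begin
      Process.kill('TERM', pid)
      Process.wait(pid)
    rescue Errno::ESRCH, Errno::ECHILD
      # Already gone, nothing to clean up.
    end
  end
  SERVER_PIDS.clear
end

# Runs once when the test process exits, however many describe blocks ran.
at_exit { shutdown_servers }

# Stand-in for a rack server: a ruby process that just sleeps.
boot_server(RbConfig.ruby, '-e', 'sleep 60')
```

A daemonized server detaches from the test process, so a real version would more likely read a pid file than remember the PID returned by spawn.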
You need a way of resetting your data between test runs. The default CloudKit::MemoryTable does not provide a mechanism for this – any deleted resource will exist in the version history of that resource (and will respond with a 410 rather than 404). By subclassing MemoryTable, we can provide a purge method that does what we need:
# A custom storage adapter that allows a total purge of a collection
# This is handy in test mode to clear out data between specs
class PurgeableTable < CloudKit::MemoryTable
  # Remove all resources in a collection.
  # Unlike a normal delete, which versions the resource (and sets up a 410 response),
  # this method removes all trace of the resource (it will 404).
  #
  # Example:
  #   CloudKit.setup_storage_adapter(adapter = PurgeableTable.new)
  #   adapter.purge('/items')
  def purge(collection)
    query {|q|
      q.add_condition('collection_reference', :eql, collection)
    }.each do |item|
      @hash.delete(@keys.delete(item[:pk]))
    end
  end
end
Since we’ll be testing the CloudKit app from a separate process, we also need a way of triggering a purge. An easy way is some custom rack middleware that provides a URL we can hit to reset the app. Clearly, we only want to enable this in test mode.
class ResetApp
  def initialize(app, options = {})
    @app = app
    @options = options
  end

  def call(env)
    request = Rack::Request.new(env)
    if request.path == '/test_reset' && request.request_method == 'POST'
      @options[:adapter].purge('/items')
      return Rack::Response.new([], 200).finish
    else
      @app.call(env)
    end
  end
end
# config.ru
CloudKit.setup_storage_adapter(adapter = PurgeableTable.new)

if ENV["RACK_ENV"] == 'test'
  use ResetApp, :adapter => adapter
end
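Since a rack app is just an object that responds to call, the reset logic can also be poked at in-process before involving real servers. The sketch below is an assumption-laden stand-in, not the middleware itself: `PlainResetApp` reads the raw env hash so it runs without the rack gem, and `RecordingAdapter` is a hypothetical fake that replaces PurgeableTable.

```ruby
# Hypothetical stand-in for PurgeableTable that just records what was purged.
class RecordingAdapter
  attr_reader :purged

  def initialize
    @purged = []
  end

  def purge(collection)
    @purged << collection
  end
end

# Same routing logic as ResetApp, written against the raw rack env hash
# so the sketch runs without the rack gem installed.
class PlainResetApp
  def initialize(app, options = {})
    @app = app
    @options = options
  end

  def call(env)
    if env['PATH_INFO'] == '/test_reset' && env['REQUEST_METHOD'] == 'POST'
      @options[:adapter].purge('/items')
      [200, {}, []]
    else
      @app.call(env)
    end
  end
end

inner   = lambda { |env| [404, {}, ['not found']] }
adapter = RecordingAdapter.new
app     = PlainResetApp.new(inner, :adapter => adapter)

# Only a POST triggers the purge; a GET falls through to the wrapped app.
reset_status, = app.call('REQUEST_METHOD' => 'POST', 'PATH_INFO' => '/test_reset')
get_status,   = app.call('REQUEST_METHOD' => 'GET',  'PATH_INFO' => '/test_reset')
```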
Now all the infrastructure is set up, we can test the CloudKit app using familiar ruby HTTP libraries:
require 'httparty'
require 'mechanize'
require 'json'
require 'oauth'

describe 'OAuth + OpenID' do
  include HTTParty
  base_uri "localhost:#{TEST_PORTS[:app]}"

  before(:each) do
    HTTParty.post("/test_reset").code.should == 200
  end

  specify 'Registering for an oauth token' do
    @consumer = OAuth::Consumer.new('cloudkitconsumer','',
      :site => "http://localhost:#{TEST_PORTS[:app]}",
      :authorize_path => "/oauth/authorization",
      :access_token_path => "/oauth/access_tokens",
      :request_token_path => "/oauth/request_tokens"
    )
    @request_token = @consumer.get_request_token

    # Log in via OpenID, then approve the OAuth request
    agent = WWW::Mechanize.new
    page = agent.get(@request_token.authorize_url)
    login_form = page.forms.first
    login_form.field_with(:name => "openid_url").value = "localhost:#{TEST_PORTS[:openid]}"
    page = agent.submit(login_form)
    oauth_form = page.forms.first
    page = agent.submit(oauth_form, oauth_form.button_with(:value => "Approve"))

    # Get access token
    @access_token = @request_token.get_access_token

    # Update an item
    result = @access_token.put("/items/12345", {:name => "Hello"}.to_json)
    result.code.should == "201"
  end
end
There’s a lot of code and not much supporting text here. I’m hoping it all just clicks together pretty easy. Hit me up with any questions.
BacktraceCleaner and gems in rails
UPDATE: Fixed the monkey-patch to match the latest version of the patch, and to explicitly require Rails::BacktraceCleaner before patching it to make sure it has been loaded
If there’s one thing my mother taught me, it’s that if you’re going to clean something up you may as well do it properly. Be thorough, cover every surface.
Rails::BacktraceCleaner is a bit sloppy when it comes to gem directories. It misses all sorts of dust – hyphens, underscores, upper case letters, numbers. That’s not going to earn any pocket money. Let’s teach it a lesson.
# config/initializers/this_is_what_a_gem_looks_like.rb
require 'rails/backtrace_cleaner'

module Rails
  class BacktraceCleaner < ActiveSupport::BacktraceCleaner
    private
      GEM_REGEX = "([A-Za-z0-9_-]+)-([0-9.]+)"

      def add_gem_filters
        Gem.path.each do |path|
          # http://gist.github.com/30430
          add_filter { |line| line.sub(/(#{path})\/gems\/#{GEM_REGEX}\/(.*)/, '\2 (\3) \4')}
        end

        vendor_gems_path = Rails::GemDependency.unpacked_path.sub("#{RAILS_ROOT}/",'')
        add_filter { |line| line.sub(/(#{vendor_gems_path})\/#{GEM_REGEX}\/(.*)/, '\2 (\3) [v] \4')}
      end
  end
end
I’ve submitted a patch to rails, please review if you like.
Kudos to Matthew Todd for pairing with me on this.
Benchmarks for creating a new array
require 'benchmark'

n = 1000
m = 50000
blank = [0] * m
Benchmark.bm(7) do |x|
  x.report(".new with block:") { (0..n).collect { Array.new(m) { 0 } }}
  x.report("  .new no block:") { (0..n).collect { Array.new(m, 0) }}
  x.report("        [0] * x:") { (0..n).collect { [0] * m }}
  x.report("           #dup:") { (0..n).collect { blank.dup }}
end
$ ruby19 benchmark.rb
                      user     system      total        real
.new with block: 10.180000   0.210000  10.390000 ( 10.459538)
  .new no block:  3.690000   0.210000   3.900000 (  3.915348)
        [0] * x:  4.280000   0.210000   4.490000 (  4.505334)
           #dup:  0.000000   0.000000   0.000000 (  0.000491)
Know your constructors! What is #dup doing? I think it’s cheating.
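One plausible explanation, stated here as an assumption and not checked against the interpreter source, is that the copy shares the original's storage and the elements are only really copied on the first write. If that's what is going on, writing to each copy should bring the cost back. A small sketch to try it (the smaller `n` is just to keep the run short):

```ruby
require 'benchmark'

n = 200
m = 50_000
blank = [0] * m

Benchmark.bm(16) do |x|
  # Copy and never touch the result.
  x.report("#dup only:")       { n.times { blank.dup } }
  # Copy, then write once, which would force any deferred copying to happen.
  x.report("#dup then write:") { n.times { copy = blank.dup; copy[0] = 1 } }
end

# Whatever the timings say, a dup must behave like an independent array.
copy = blank.dup
copy[0] = 1
```

No expected timings are given because they depend on the ruby version; the interesting part is whether the second row is noticeably slower than the first on your box.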
Acts_as_state_machine locking
Consider the following!
class Door < ActiveRecord::Base
  acts_as_state_machine :initial => :closed

  state :closed
  state :open, :enter => :say_hello

  event :open do
    transitions :from => :closed, :to => :open
  end

  def say_hello
    puts "hello"
  end
end

door = Door.create!

# Two processes race to open the same door
fork do
  Door.transaction do
    door.open!
  end
end

door.open!

# >> hello
# >> hello
It’s broken: you should only be able to open a door once, yet it said hello twice. This is a classic double-update problem. One way to solve it is with pessimistic locking. I made some code that automatically locks any object when you call an event on it.
class ActiveRecord::Base
  # Forces all state transition events to obtain a DB lock
  def self.obtain_lock_before_all_state_transitions
    event_table.keys.each do |transition|
      define_method("#{transition}_with_lock!") do
        self.class.transaction do
          lock!
          send("#{transition}_without_lock!")
        end
      end
      alias_method_chain "#{transition}!", :lock
    end
  end
end

class Door < ActiveRecord::Base
  # ... as before

  obtain_lock_before_all_state_transitions
end
Beware! Your state transitions can now throw ActiveRecord::RecordNotFound errors (from lock!), since the object may have been deleted before you got a chance to play with it.
If you’re not using any locking in your web app, you’re probably doing it wrong. Just sayin’.
Range#include? in ruby 1.9
Range#include? behaviour has changed in ruby 1.9 for non-numeric ranges. Rather than a greater-than/less-than check against the min and max values, the range is iterated over from min until the test value is found (or max). This is necessary to cover some edge cases of ranges which are incorrect in 1.8.7, as demonstrated by the following example:
class EvenNumber < Struct.new(:value)
  def <=>(other)
    puts "#{value} <=> #{other.value}"
    value <=> other.value
  end

  def succ
    puts "succ: #{value}"
    EvenNumber.new(value + 2)
  end
end

puts (EvenNumber.new(2)..EvenNumber.new(6)).include?(EvenNumber.new(5))

# 1.8.7
# 2 <=> 6
# 2 <=> 5
# 5 <=> 6
# true # buggy!

# 1.9.1
# 2 <=> 6
# 2 <=> 6
# succ: 2
# 4 <=> 6
# succ: 4
# 6 <=> 6
# false # correct!
This makes sense for the conceptual range, but has a performance impact especially on large ranges. #include? has gone from O(1) to O(N). This is most likely to crop up when checking time ranges – Time#succ returns a time one second in the future.
(Time.utc(1999)..Time.utc(2001)).include?(Time.utc(2000))

# 1.8.7
# true
# 1.9.1
# Don't wait for this to finish...
Workarounds
Ruby 1.9 introduces a new method Range#cover? that implements the old include? behaviour, however this method isn’t available in 1.8.7.
puts (EvenNumber.new(2)..EvenNumber.new(6)).cover?(EvenNumber.new(5))

# 1.8.7
# undefined method `cover?' for #<struct EvenNumber value=2>..#<struct EvenNumber value=6> (NoMethodError)

# 1.9.1
# 2 <=> 6
# 2 <=> 5
# 5 <=> 6
# true
Another alternative, if it makes sense for your range, is to define the to_int method, which ruby will use to do a straight comparison against your min/max values.
class EvenNumber < Struct.new(:value)
  # ... as before

  def to_int
    value
  end
end

puts (EvenNumber.new(2)..EvenNumber.new(6)).include?(EvenNumber.new(5))

# 1.8.6 and 1.9.1
# 2 <=> 6
# 2 <=> 5
# 5 <=> 6
# true
Personally, I’ve monkey-patched range in 1.8.* to alias cover? to include?. That’s it. May your test suites not appear to hang.
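For the record, the alias is roughly this. It's a sketch of the idea, not a copy of the actual patch, and the `method_defined?` guard is an addition so that the same file is harmless on 1.9, where cover? already exists.

```ruby
# Only define cover? where it is missing (ruby 1.8.x); a no-op on 1.9 and later.
class Range
  alias_method :cover?, :include? unless method_defined?(:cover?)
end

# cover? only compares against the end points, so time ranges answer immediately.
range   = Time.utc(1999)..Time.utc(2001)
inside  = range.cover?(Time.utc(2000))
outside = range.cover?(Time.utc(2002))
```

Code that calls cover? everywhere then gets the cheap min/max comparison on both versions.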
Faster rails testing with ruby_fork
A long running test suite isn’t the problem. Your build server can take care of that. A second or two here or there, no one notices.
The killer wait is in the red/green/refactor loop. You’re only running one or two tests, and an extra second can mean the difference between getting into flow or switching to twitter. And you know what kills you in rails?
$ time ruby -e '' -r config/environment.rb

real    0m3.784s
user    0m2.707s
sys     0m0.687s
Yep, the environment. That’s a lot of overhead to be waiting for every time you run a test, especially since it’s the same code every time! You fix this with a clever script called ruby_fork that’s included in the ZenTest package. It loads up your environment, then just chills out, waiting. You send a ruby file to it, and it forks itself (the process containing the environment) to execute that file. The beauty of this is that forking is really quick, and it leaves a pristine copy of the environment around for the next test run.
‘Environment’ doesn’t just have to be environment.rb, for bonus points you can load up test_helper.rb, which will also load your testing framework into memory. In fact, you can preload any ruby code at all – ruby_fork isn’t rails specific.
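The forking idea itself fits in a few lines of plain ruby. This is only a sketch of the principle, not how ruby_fork is written (ruby_fork runs as a server and takes requests from ruby_fork_client): `load_environment`, `ENVIRONMENT` and `run_in_fork` are made-up names, and the sleep stands in for loading environment.rb.

```ruby
# Stand-in for the slow part (loading environment.rb).
def load_environment
  sleep 0.2
  { :loaded_at => Time.now }
end

# Paid once, in the parent process.
ENVIRONMENT = load_environment

# Run a block in a forked child that inherits the loaded environment.
# Returns true if the child exited cleanly. The parent is left untouched.
def run_in_fork
  pid = fork do
    yield ENVIRONMENT
    exit!(0) # leave straight away, skipping the parent's at_exit hooks
  end
  Process.wait(pid)
  $?.success?
end

# Each run starts from the already-loaded state instead of loading it again.
started = Time.now
ok = run_in_fork { |env| env.fetch(:loaded_at) }
elapsed = Time.now - started
```

Anything the child changes dies with the child, which is why the parent's copy of the environment stays pristine for the next run.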
$ ruby_fork -r test/test_helper.rb &
/opt/local/bin/ruby_fork Running as PID 526 on 9084

$ time ruby_fork_client -r test/unit/your_test.rb
Started
...
Finished in 0.565636 seconds. # Aside: this time is bollocks

3 tests, 4 assertions, 0 failures, 0 errors

real    0m0.972s # This is the time you're interested in
user    0m0.225s
sys     0m0.035s
That’s fantastic, though you’ll notice in newer versions of rails your application code is not reloaded. By default your test environment caches classes – which normally isn’t a problem except that newer rails versions also eager load those classes (so they’re loaded when you load environment.rb). You can fix this by clearing out the eager load paths in your test environment file:
# config/environments/test.rb
config.eager_load_paths = []
On my machine this gets individual test runs down from about 4 seconds to less than 1 second. You can sell that to your boss as a four-fold productivity increase.
Testing Glue Code
db2s3 combines together 3 external dependencies – your database, the filesystem, and Amazon’s S3 service. It has 1 conditional in the main code path (and it’s not even an important one). The classic unit testing approach of “stub everything” provides little benefit.
Unit testing is good for ensuring complex code paths execute properly, that edge cases are properly explored, and for answering the question “what broke?”. For trivial glue code, none of these are of particular benefit. There are no complex code paths or edge cases, and it will be quickly obvious what broke. In fact, the most likely thing to “break” (or change) over time isn’t your code, but the external services it is sticking together, which stubs cannot protect you from. Considering the high relative cost of stubbing out all your dependencies, unit testing becomes an expensive way of testing something quite simple.
For glue code, integration tests are the best solution. Glue code needs to stick, and integration tests ensure that it does. Here is the only test that matters from db2s3:
it 'can save and restore a backup to S3' do
  db2s3 = DB2S3.new
  load_schema
  Person.create!(:name => "Baxter")
  db2s3.full_backup
  drop_schema
  db2s3.restore
  Person.find_by_name("Baxter").should_not be_nil
end
This test costs money to run since it hits the live S3 service, but only in the academic sense. The question you need to ask is “would I pay one cent to have confidence my backup solution works?”
Always remember why you are testing. Unit tests are a focussed tool, and not always necessary.
Backup MySQL to S3 with Rails
Here is some code I wrote over the weekend – db2s3. It’s a rails plugin that provides rake tasks for backing up your database and storing it on Amazon’s S3 cloud storage. S3 is a trivially cheap offsite backup solution – for small databases it costs about 4 cents per month, even if you’re sending full backups every hour.
There are many scripts around that do this already, but they fail to address the biggest actual problem. The aws-s3 gem provides a really nice ruby interface to S3, and dumping a backup then storing it really isn’t that hard. The real problem is that I really hate system administration. I want to spend as little time as possible and I want things to Just Work.
A script is great but there’s still too many things for me to do. Where does it go in my project? How do I set my credentials? How do I call it?
That’s why a plugin was needed. It’s as little work as possible for a rails developer to backup their database, so they can get back to making their app awesome.
db2s3. Check it out.
Singleton resource, pluralized controller fix for rails
map.resource still looks for a pluralized controller. This has always bugged me. Here’s a quick monkey patch to fix. Tested on rails 2.2.2.
# config/initializers/singleton_resource_fix.rb
module ActionController
  module Resources
    class SingletonResource < Resource #:nodoc:
      def initialize(entity, options)
        @singular = @plural = entity
        # options[:controller] ||= @singular.to_s.pluralize
        options[:controller] ||= @singular.to_s # This is the only line to change
        super
      end
    end
  end
end
# config/routes.rb
# before fix
map.resource :session, :controller => 'sessions'

# after fix
map.resource :session
inject and collect with jQuery
You know, I would have thought someone had already made an enumerable plugin for jQuery. Maybe someone has. Mine is better.
- Complete coverage with screw-unit
- Interface so consistent with jQuery you’ll think it was core
squares = $([1,2,3]).collect(function () {
  return this * this;
});
squares // => [1, 4, 9]
It’s on github. It deliberately doesn’t have the kitchen sink – fork and add methods you need, there’s enough code it should be obvious the correct way to do it.
As an aside, it’s really hard to spec these methods concisely. I consulted the rubyspec project and it turns out they had trouble as well, check out this all encompassing spec for inject: “Enumerable#inject: inject with argument takes a block with an accumulator (with argument as initial value) and the current element. Value of block becomes new accumulator”. Bit of a mouthful eh.
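For comparison, here is what that mouthful of a spec boils down to in ruby itself. This is just the built-in Enumerable behaviour, nothing from the plugin:

```ruby
# With an argument, 10 is the initial accumulator; the block's value
# becomes the new accumulator at each step: 10 -> 11 -> 13 -> 16
sum = [1, 2, 3].inject(10) { |accumulator, element| accumulator + element }
# => 16

# collect, for symmetry with the jQuery example
squares = [1, 2, 3].collect { |element| element * element }
# => [1, 4, 9]
```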
Post your improvements in the comments.
Code for Christmas
Developers don’t have enough time.
We’re all too busy working our day job, or looking after our better half, to give our pet projects the attention they deserve.
That makes time the most valuable thing we can give. This year for Christmas, why not give a fellow developer some?
Ticking off an amazon wishlist never really resonated with me, so this year here is what we are all doing instead:
- Find someone’s pet open source project – I’d start at github
- Contribute! It doesn’t have to be much – a spec or two, some documentation, or even just a “hey it works on my box”. Fork, commit, pull request.
- Wish them a Merry Christmas!
That shouldn’t take you more than an hour. It’s a total win all around – you get to hone your chops, they get some love on their project, and the open source ecosphere is improved. If you’re feeling generous, or don’t have any friends, there’s no shortage of projects that I’m sure would welcome some support.
My wishlist is any of the ruby midi projects out there.
Unique data in dm-sweatshop
dm-sweatshop is how you set up test data for your datamapper apps. Standard practice is to generate random data that follows a pattern:
User.fix {{
  :login => /\w+/.gen
}}

new_user = User.gen
Let’s not now debate whether or not random data in tests is a good idea. What’s more important is that the above code should make you uneasy if login is supposed to be unique. There was a hack in sweatshop that would try recreating the data if you had a uniqueness constraint on login and it was invalid, but it was exactly that: a hack. As of a few days ago (what will be 0.9.7), you need to be more explicit if you want unique data. It’s pretty easy:
include DataMapper::Sweatshop::Unique

User.fix {{
  :login => unique { /\w+/.gen }
}}
Tada! You can also easily get non-random unique data by providing a block with one parameter. Check the README for this and other cool things you can do.
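To give a feel for the two flavours, here is a toy helper with the same shape. It is emphatically not dm-sweatshop's implementation: `UniqueSketch`, its `key` argument, the counter starting at zero and the retry limit are all assumptions made up for the illustration.

```ruby
# Hypothetical helper for illustration; not dm-sweatshop's code.
module UniqueSketch
  MAX_TRIES = 100

  # Values already handed out, grouped by key.
  def self.seen
    @seen ||= Hash.new { |hash, key| hash[key] = [] }
  end

  # Next counter value per key, for the non-random flavour.
  def self.counters
    @counters ||= Hash.new(0)
  end

  def self.unique(key = :default, &block)
    if block.arity == 1
      # One block parameter: hand over an incrementing counter.
      n = counters[key]
      counters[key] += 1
      block.call(n)
    else
      # No block parameter: retry until the block produces something new.
      MAX_TRIES.times do
        value = block.call
        unless seen[key].include?(value)
          seen[key] << value
          return value
        end
      end
      raise "no unique value for #{key.inspect} after #{MAX_TRIES} tries"
    end
  end
end

logins = Array.new(5) { UniqueSketch.unique(:login) { rand(10) } }
names  = Array.new(3) { UniqueSketch.unique(:name) { |n| "user-#{n}" } }
```

The retry limit matters: ask for more unique values than the generator can produce and a helper like this should fail loudly rather than spin forever.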
Introducing SocialBeat (screencast)
Here is a screencast of socialbeat in which you will note:
- I don’t appear drunk
- I don’t reveal intra-company communications
- I show off the full gamut of socialbeat’s awesomeness in under 3 minutes
In these ways you may find it superior to other screencasts you may have seen on the matter.
Introducing SocialBeat
If you are behind the times – socialbeat is some code that lets you live code OpenGL visualizations to MIDI tracks.
Comparing lambdas in ruby
to_ruby is a really convenient way to compare the equality of two lambdas. It’s a bit slow though. If we get our hands dirty (only a little!) with ParseTree, we can get a result 2 orders of magnitude quicker. I’d be interested to see if these benchmarks differ significantly on other versions of ruby.
~ $ ruby -v
ruby 1.8.6 (2007-09-23 patchlevel 110) [i686-darwin8.11.1]
require 'benchmark'
require 'parse_tree'
require 'ruby2ruby'

def gen_lambda
  lambda {|x| x + 1 }
end

Parser = ParseTree.new(false)

# This only requires parse tree, not ruby2ruby
def proc_identity(block)
  klass = Class.new
  name = "myproc"
  klass.send(:define_method, name, &block)

  # .last ignores the method name and definition - they're irrelevant
  Parser.parse_tree_for_method(klass, name).last
end

n = 1000
Benchmark.bmbm do |x|
  x.report("#to_ruby") { n.times { gen_lambda.to_ruby == gen_lambda.to_ruby }}
  x.report("#to_sexp") { n.times { gen_lambda.to_sexp == gen_lambda.to_sexp }}
  x.report("manual") { n.times { proc_identity(gen_lambda) == proc_identity(gen_lambda) }}
end
               user     system      total        real
#to_ruby   4.460000   0.220000   4.680000 (  4.695327)
#to_sexp   0.920000   0.190000   1.110000 (  1.110214)
manual     0.030000   0.000000   0.030000 (  0.032768)
In case you were wondering, I was playing around with this while implementing unique data generation for dm-sweatshop.