Testing HTTP Authentication

Posted by Luke Francl
on Tuesday, June 30

If you ever need to test HTTP Authentication in your functional tests, here is how you do it:

def test_http_auth
  @request.env['HTTP_AUTHORIZATION'] = ActionController::HttpAuthentication::Basic.encode_credentials("quentin", "password")
  get :show, :id => @foobar.id

  assert_response :success
end

This is much like testing SSL.
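
For the curious, here is a minimal plain-Ruby sketch of roughly what ends up in that header. It assumes the helper produces the standard Basic scheme (the word "Basic" followed by the Base64 of "user:password"); the method name basic_auth_header is made up for illustration, and Rails’ own helper may differ in whitespace details.

```ruby
# Hypothetical helper for illustration; not part of Rails.
# Builds an HTTP Basic Authorization header value by hand.
def basic_auth_header(user, password)
  # pack("m0") is Base64 encoding without line breaks
  "Basic " + ["#{user}:#{password}"].pack("m0")
end

header = basic_auth_header("quentin", "password")

# Decoding the token gets the original credentials back, which is a
# reminder that Basic auth is an encoding, not encryption.
scheme, token = header.split(" ", 2)
decoded = token.unpack("m").first
puts scheme
puts decoded
```

Because the credentials are only encoded, it makes sense to pair this kind of authentication with SSL.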

Hat tip: Philipp Führer for Functional test for HTTP Basic Authentication in Rails 2.

Adding Routes for Tests

Posted by Luke Francl
on Monday, June 22

I like to be extremely judicious with use of routes. Fewer routes means less memory consumption and fewer confusing magical methods.

I always delete the default route map.connect ':controller/:action/:id' (you should too, otherwise all your pretty RESTful routing is easily circumvented). Since Rails now has the ability to remove unneeded RESTful routes, I’ve been removing those, too.

However, this judiciousness recently painted me into a corner. I have a controller action that I would like to test and it’s wired up like this:

map.logout '/logout', :controller => 'user_sessions', :action => 'destroy', :method => 'delete'

I don’t have this mapped any other way, because why should I?

def test_logout_should_redirect_to_root_path
  UserSession.create(User.first)

  delete :destroy

  assert_match /logged out/, flash[:notice]
  assert_redirected_to root_path
end

Unfortunately, the test fails with ActionController::RoutingError: No route matches {:action=>"destroy", :controller=>"user_sessions"}! Huh?

The problem is that the delete (and get, post, etc.) method can’t find the route that I created.

Initially, I worked around this using with_routing to define a whole new set of routes just for that test.

with_routing do |set|
  set.draw do |map|
    map.resource :user_sessions, :only => [:destroy]
    map.root :controller => 'foobars', :action => 'index'
  end

  delete :destroy

  assert_match /logged out/, flash[:notice]
  assert_redirected_to root_path
end

But that was annoying. And after I had more than one route exhibiting this problem, it got really annoying.

Fortunately, I found Sam Ruby’s post Keeping Up With Rails about the challenge of Rails’ minor, quasi-documented API changes. Sam’s post has a bit about how you can add new routes without clearing the existing routes in Rails 2.3.2, which I knew was possible. Following Sam’s link to the commit (there are no docs for this) showed how to do it.

Now, I’ve added this to test_helper.rb:

class ActionController::TestCase
  # add a catch-all route for the tests only.
  ActionController::Routing::Routes.draw { |map| map.connect ':controller/:action/:id' }
end

The downside to this is that real problems with broken routes may get swept under the rug. You could be more restrictive with the routes you are adding just for tests to overcome that problem.

Update: Thanks to Adam Cigánek in the comments for pointing out my error in why the route didn’t get picked up in the tests. I had the condition hash wrong!

Instead of:

map.logout '/logout', :controller => 'user_sessions', :action => 'destroy', :method => 'delete'

It should be:

map.logout '/logout', :controller => 'user_sessions', :action => 'destroy', :conditions => {:method => :delete}

The first way I had worked correctly when testing manually, but only because without :method, the route responds to all HTTP methods (still no clue why my test didn’t pick it up, though).

Interestingly enough, there’s another gotcha here. Notice that I specified :method => 'delete'. Even when put into the :conditions hash, that doesn’t work. You MUST pass a symbol (:delete) for the HTTP method.
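
Why would a string fail where a symbol works? One plausible explanation (an assumption, not checked against the Rails source) is that the request method is held as a symbol and compared directly with the condition, and in Ruby a symbol never equals a string. The toy matcher below is purely illustrative; method_matches? is a made-up name and this is not how Rails’ router is actually written.

```ruby
# Toy matcher for illustration only; this is NOT Rails' routing code.
# A route with no :method condition accepts every HTTP method.
def method_matches?(conditions, request_method)
  required = conditions[:method]
  required.nil? || required == request_method
end

symbol_condition = method_matches?({ :method => :delete }, :delete)
string_condition = method_matches?({ :method => 'delete' }, :delete)
no_condition     = method_matches?({}, :delete)

puts symbol_condition # => true
puts string_condition # => false, because 'delete' != :delete
puts no_condition     # => true
```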

This fixed my problem, but if I ever do need to add routes for tests, now I know how…

JavaScript gotcha: storing objects in an associative array

Posted by Luke Francl
on Wednesday, June 17

I just ran into a tricky gotcha in JavaScript.

I was trying to store some objects in an associative array. Based on my experience with Java, Ruby, and other languages, I expected that given code like this:

var dictionary = {};

var obj1 = {}; 
var obj2 = {};

dictionary[obj1] = 'foo'
dictionary[obj2] = 'bar'

The result of dictionary[obj1] would be ‘foo’ and dictionary[obj2] would be ‘bar’.

This is not the case!

The problem is that JavaScript objects are not really hash tables. They’re associative arrays, and the key can only be a String. When you insert an object into an associative array, toString() is called and that is used as the key. Unfortunately, the default toString implementation for JavaScript objects returns “[object Object]”...which is not only very unhelpful when debugging, but doesn’t provide you with a unique key for your associative array.

You can work around this problem by overriding toString. Or you can figure out another way to associate your object with a value. D’oh!
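
For contrast, here is a small Ruby sketch of the behavior I was expecting, next to a simulation of what JavaScript does. Ruby’s Hash keys on the object itself, so two distinct objects stay distinct. The js_to_string lambda is a hypothetical stand-in for JavaScript’s default toString(), not anything Ruby provides.

```ruby
obj1 = Object.new
obj2 = Object.new

# Ruby's Hash uses the object itself as the key (via #hash and #eql?),
# so two distinct objects are two distinct keys.
by_object = {}
by_object[obj1] = 'foo'
by_object[obj2] = 'bar'

# Simulated JavaScript behavior: every key is coerced to a string first,
# and every plain object stringifies to the same text.
js_to_string = lambda { |_obj| '[object Object]' } # hypothetical stand-in

by_string = {}
by_string[js_to_string.call(obj1)] = 'foo'
by_string[js_to_string.call(obj2)] = 'bar' # overwrites the first entry

puts by_object[obj1] # => foo
puts by_string.size  # => 1
puts by_string[js_to_string.call(obj1)] # => bar
```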

Sprinkle: the provisioning tool for people who don't have huge server clusters

Posted by Luke Francl
on Thursday, June 04

I’ve recently been trying to find a good server automation tool that meets my needs. I looked at Chef and Puppet.

They are both awesome for what they do, but what I don’t like is all the infrastructure I have to maintain to run Chef or Puppet. You need a server to host your server configuration on. But I only have one server![1] Chef does have a solo version which can download configuration from a web server and run it. That’s cool, but I don’t want to have a web server just for putting server configuration on.

When the time commitment to set one of these tools up greatly exceeds how long it takes me to bring up a new slice and run through the standard Apache/DB/Passenger stack, I lose interest. In the end, these are great tools for managing a cluster of machines and bringing up a new app in the cluster quickly, and keeping it up to date automatically. If you have big infrastructure needs, they make sense. If you just want to set up a single slice…ugh.

After reading a bit about how Puppet and Chef work, what I really wanted was the ability to push server provisioning recipes. I want to maintain the server config in my repository and then provision a new server with a command I run on my machine. Sort of like deprec, but understandable.

Fortunately, I found Sprinkle and passenger-stack.

Sprinkle lets me quickly define which packages I want installed and push it out to a server to run (via Capistrano, Vlad, or Net::SSH). Sprinkle makes it easy to install software using apt, gem, or source. And unlike a simple shell script, Sprinkle tests whether or not the software is installed before running, and has a concept of dependencies.

Passenger-stack removes the pain of writing my own rules for what to install. It comes with the standard stuff you’d need, and you can customize it from there.

Here’s how you install all the software you need for a fresh server, after downloading passenger-stack:

sprinkle -c -s config/install.rb

The best part is that you can run that command again, and it won’t do anything. So you can add new software to your stack, then run it against your server, and only the new software will get installed.

This gives you a great way to manage natively compiled gems and ensure that if you ever need to spin up a staging server or a demo server, everything you need gets installed.
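
The reason a second run is harmless comes down to two ideas: check before installing, and install dependencies first. Here is a toy model of that in plain Ruby. To be clear, ToyProvisioner and its methods are invented for this sketch and are not Sprinkle’s API; a real tool would verify against the server rather than an in-memory list.

```ruby
# Toy model of verify-before-install with dependencies.
# Invented for illustration; this is NOT Sprinkle's API.
class ToyProvisioner
  attr_reader :log

  def initialize(already_installed = [])
    @installed = already_installed.dup
    @packages = {}
    @log = [] # records what actually got installed, in order
  end

  # Declare a package and the packages it depends on.
  def package(name, requires: [])
    @packages[name] = requires
  end

  def install(name)
    return if @installed.include?(name) # the "verify" step: skip if present

    @packages.fetch(name).each { |dependency| install(dependency) }
    @log << name
    @installed << name
  end
end

server = ToyProvisioner.new([:apache])
server.package(:apache)
server.package(:ruby)
server.package(:passenger, requires: [:apache, :ruby])

server.install(:passenger)
first_run = server.log.dup  # only :ruby and :passenger were missing

server.install(:passenger)
second_run = server.log.dup # nothing new happens the second time
```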

Check out this screencast by Ben Schwarz, author of passenger-stack.

Passenger-stack demo from Ben Schwarz on Vimeo.

It’s not as smart as Chef and Puppet. It’s not transactional and servers don’t check for new software to install automatically. But it sure is easy. That’s why I call Sprinkle “the provisioning tool for people who don’t have huge server clusters.”

[1] Basically. I have many servers with many different applications on them. And I have a few servers that have multiple environments, but the same software. That’s my big driver for wanting a provisioning tool.

Estimating software: a rule of thumb

Posted by Jon
on Tuesday, June 02

Estimating software is hard, but most of us have to do it – whether we’re estimating an entire project for a client, or a new feature for a boss, or a change to one of our own projects.

I’ve found the following rule helpful when estimating software. This comes from about four years of estimating Rails projects to consulting clients, and moving from bad – dramatically underestimating fixed-bid projects – to pretty good – usually overestimating time & materials projects slightly. (And more importantly, knowing when I can’t estimate, because the scope is too vague or too large.)

Jon’s Law of Estimates

Software difficulty is primarily determined by volume, logic, and integration.

Jon’s Law of Estimates, explained

1. Volume is easy to understand. If you’re building software that does more, it will require more work. So if you’re estimating a project that stores recipes, and you’re estimating another project that stores recipes AND shopping lists, you can expect that the second one will take more work (if everything else is equal).

2. Logic refers to the rules or business logic behind a feature. The more rules there are, the more work there is. Imagine that our recipe system requires that recipes from some users are manually approved by an administrator, and checks to see that each ingredient in the recipe is present in the step-by-step instructions, and only allows a user to post 3 recipes per hour, and lets users propose alternative versions of a recipe, and lets an alternative version replace the regular version if it achieves a certain rating, etc. That’s more work than a recipe system that just lets users create and rate recipes, even though the volume of features may not be any larger.

Interestingly, a technology can make some logic trivial and some logic hard. Nested forms are a great example of this. Before Rails 2.3, Rails made it trivial to do CRUD on a single table at a time, but difficult to handle multiple tables. Now it is (almost) trivial to do CRUD on multiple tables at a time.

3. Integration points are usually deserving of special consideration in an estimate. This includes talking to a web services API, another local software system, a data feed, a complex library, etc. Not only do integration points often take time to get right, but they can become sinkholes of time when the documentation is inadequate or incorrect, the other system doesn’t play nice, or you can’t easily test the integration. And your estimate depends on something out of your control: the other system.

External factors

These rules only apply to the difficulty of the software. Several external factors are important as well. These include, most notably, the client and the team. The client can make a project easy, or they can make a project difficult. Similarly, the right team might be able to blaze through a project quickly, while the wrong team may never finish at all.

The other side of estimating

Here’s the thing about these rules: they’re relative, not absolute. There is no rule that says “Features take 5 days, and integration points take 10”. So estimating requires comparisons. This means that if you’ve never built a Rails app before, you’ll have trouble estimating a Rails project. But once you’ve built a few, you can compare the volume, logic, and integration points of a new project to volume, logic, and integration points of the previous ones.

So estimating requires intuition and experience as well as analysis (e.g. Jon’s Law of Estimates). The key to estimating is to combine analysis and intuition, and to let each side refine the other.

Announcing VeloTweets, Pulse of the Peloton

Posted by Luke Francl
on Sunday, May 10

I’m pleased to announce VeloTweets, the pulse of the peloton, a curated collection of professional cycling Twitter activity. The idea and driving force came from Jamie Thingelstad. I did most of the development, and Norm Orstad designed the site. Chris Hatch helped a lot on the back end, providing a list of cyclists on Twitter, filling out profiles and affiliations, and doing research.

What’s Different about VeloTweets?

We wanted to make VeloTweets different than the other subject matter aggregators out there. We wanted a hook that would combine the immediacy of Twitter with pro cycling in a compelling way.

Here’s what we came up with.

First, we focused on who to include. Instead of everyone who’s talking about cycling, this contains only pro cyclists (and a few others associated with the sport, like managers or team mechanics).

Second, we extended the data that is given to us by Twitter. We can enter every cyclist’s real name, nationality, and team, as well as expanded biographical data (here’s Lance Armstrong’s profile for instance).

Third, we collected cycling events in a calendar that’s displayed on the site, and added a Message of the Day that’s tuned to what’s happening in the racing world each day.

Fourth, we brought in photos from the tweets (only TwitPic is supported right now). We store references to the photos in our DB so we can show the latest photos, along with photos that individuals have posted, and all of them. This turns out to be really cool because where else are you going to see photos like this one as they happen?

After all this we still weren’t totally satisfied with what we’d come up with, because it still looked too much like Twitter (long list of messages in reverse chronological order). Then Jamie came up with the idea of only displaying each cyclist’s most recent tweet in a grid. We really like how this works because people who tweet a lot (like Lance) don’t dominate the page. It gives you an overview of what the whole peloton is talking about without letting a few people dominate it.
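
The grid idea boils down to “latest tweet per cyclist.” Here is a minimal in-memory sketch of that selection. The Tweet struct, its field names, and the sample data are assumptions made for illustration; the real site works against database records, and this is not its actual code.

```ruby
# Hypothetical record shape for illustration only.
Tweet = Struct.new(:author, :text, :created_at)

# Keep only each author's most recent tweet, newest first.
def latest_per_author(tweets)
  tweets
    .group_by(&:author)
    .map { |_author, authored| authored.max_by(&:created_at) }
    .sort_by(&:created_at)
    .reverse
end

tweets = [
  Tweet.new('prolific_rider', 'Morning ride done', Time.at(100)),
  Tweet.new('prolific_rider', 'Lunch',             Time.at(300)),
  Tweet.new('quiet_rider',    'Race day',          Time.at(200))
]

grid = latest_per_author(tweets)
grid.each { |tweet| puts "#{tweet.author}: #{tweet.text}" }
```

However often someone tweets, they take up exactly one cell in the grid.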

Behind the scenes

This application uses Rails 2.3, the Suspenders base app, make_resourceful, semantic_form_builder and the excellent HTTPClient library for interacting with Twitter (give up on net/http – it is full of fail).

Twitter API access is done directly with JSON. We pull the friends_timeline and insert those tweets into the database.

Developing for Twitter

I’ve been doing a number of Twitter-related projects lately. The first was Twistr, which combines Twitter and Flickr LOLcat style for occasionally amusing results. Then Barry Hess and I built Follow Cost, which tells you how much someone tweets before you follow them. I created a prototype for FanChatter’s next product based on Twitter conversation aggregation. Now comes VeloTweets and another project that’s not public yet.

I really enjoy working with the Twitter APIs. It’s fun to develop applications that utilize the platform that the Twitter folks have built.

On that front, I recently received a copy of Twitter API: Up and Running (Follow Cost is mentioned on page 70!) which I will give a full review to soon. You don’t need a book on the Twitter API to develop applications for it, but it does provide some ideas and a useful reference, as well as details on some interesting aspects of Twitter (for example, I did not know that direct messages disappear if they are deleted by either party).

Music and programming: interviews with Chad Fowler and Dave Thomas

Posted by Jon
on Thursday, April 30

I’ll be speaking at RailsConf 2009 this year on music and software development (Five musical patterns for programmers). The basic premise is that software development and music actually have quite a bit in common. This may be surprising to some people, who see programming as a cold, rational left-brain sort of thing, like science. But we programmers know that this is not really the case at all.

So as a prelude to my talk, I decided to interview two programmer-musicians on the subject: Chad Fowler and Dave Thomas. Both compose and perform music, and both are noted programmers. Here is the interview.

Rail Spikes: Tell us a little about your background with both programming and music.

Chad Fowler: I started my professional life as a saxophonist in Memphis. I played the Beale Street clubs and all the typical Memphis professional musician stuff. Among others, I played for a while with Ann Peebles and her husband Don Bryant with the rhythm section from all the old Hi Records recordings. I did mostly R&B and jazz professionally but I was probably most well known in the Memphis community for making “strange” music. Before playing music professionally, I played guitar in punk bands in high school. I was a fan of punk, heavy metal, hip hop, pop, (new) classical and pretty much everything else. As I immersed myself in the world of jazz, it became quickly clear that the jazz community doesn’t like punk and other less “serious” types of music and has an almost religious negative reaction to jazz musicians who do.

It was almost as if any deviation from the “normal” world of jazz made you a traitor. So I did the natural thing: started a group called The Jazz Traitors, which played music that 1) we loved and 2) offended the jazz community (not necessarily in that order).

I was also very interested in composing “classical” music. I studied with a composer named Kamran Ince, who is still my favorite such composer.

As for programming, I’ve been interested in programming since I was a young child using my Commodore 64. I wasn’t really that good at it as a kid but I played around a lot. I didn’t get serious until I picked up programming again as a hobby while I was a professional musician. After a late night gig at a bar, it was relaxing to go home and unwind to some C programming tutorials. I didn’t have a need to program, nor did I have a project in mind (except that I have always loved video games and wanted to learn how they worked). But I got so into it, that I ended up getting a job in computer support because a friend filled out an application for me.

Being the gamer I am, as soon as I started in computer support, I naturally wanted to “level up”. That meant becoming a network administrator. Then a system administrator. Then a programmer, then a designer, then an architect, then a CTO, etc. Now here I am. It’s been fun.

Dave Thomas: There was always a lot of music in our house. My father liked to play the piano and the organ (I learned to solder as he built a Heathkit organ from a kit in the late 60s). My mother liked Broadway musicals. So we’d often experience alternating hours of Chopin and South Pacific. My brother was also musical. I wasn’t particularly, but I enjoyed noodling on the piano, and spent hours just playing with chords and progressions.

I’ve been programming since I was 15 or so.

Rail Spikes: Some developers – yourself included – have suggested a similarity between programming and music composition or performance. How exactly are music and programming similar?

Dave Thomas: I’m not sure, but I think it might be something to do with the discovery of patterns. Both music and code consist of nested sets of variations and repetitions. There’s a rhythm to executing code, in the same way there’s a rhythm to music. It is never exact, but it’s there. After a while, I found I could imagine the rhythm and structure of my programs as they run, in the same way you can pick apart the structure of a piece of music as you listen to it. And, just as with music, it takes experience to be able to feel the deeper structures and notice the more extreme variations. But being able to spot them in programs makes coding simpler and more interesting. The basic coding structures (loops, method calls, and so on) provide the framework for composing in the same way that staff and bar lines do for music. Algorithms are like the progressions, and data becomes the notes. And in the same way that good music takes all these things and then surprises you, good code does the same thing. It isn’t mechanical and repetitive: instead it uses the constraints to build something bigger and more interesting.

Chad Fowler: It’s hard for me to put my finger on. There’s something similar in the way I think when I do each.

I think it all boils down to language, though. In all of these cases (including learning actual language), you take a bunch of tokens (notes, sounds, grunts, functions, classes) and combine them into a grammar which you use to express ideas. The way you do that is totally up to you as long as the intended ideas are communicated. With computer programs, they have to do what they’re meant to do. With music, they express or evoke emotions, paint pictures, cause anxiety or whatever.

Some computer programs evoke emotions and cause anxiety as well.

Rail Spikes: Is Ruby development more like improvised jazz or composed classical music?

Chad Fowler: I think it’s both. And I don’t think Ruby is any different in this than other languages. Much of the discussion about the relationship between programming and music focuses on the more obvious idea of programming as composition. It makes sense, since programmers tend to sit and type their ideas into an editor and then eventually execute it. The programs can be checked, tested, refactored, etc. before the actual performance. This is how classical composition works as well.

But the less obvious angle is that in many situations, programming is like performance. In fact, even in music, improvisation is really just real time composition. You don’t get a chance to refactor because your “code” is executed as you write it.

I’ve had this same feeling while debugging production problems, hacking new features on a tight deadline, or sometimes during the initial creation of an application. The same synapses are firing as when I was trying to play Cherokee at 200 beats per minute. Mistakes can’t be erased, so they have to be nuanced into (worst case) insignificant events or (best case) important drivers behind the work.

From a purely development-oriented perspective, TDD is more like improvisation than composition. I think that’s what I like about it. It’s motivating and creative in an exciting, time-sensitive way. You take small steps and see where they lead you. Sure, you can always revert your changes if you paint yourself into a corner but part of the fun and challenge is to not paint yourself into a corner.

One thing jazz musicians like to say is that every wrong note is just a half step away from a right note. TDD is like that. You might take a slightly wrong turn. It’s fun to see if you can course-correct without starting over.

Rail Spikes: Do developers need to be musically inclined? Does it help?

Chad Fowler: Obviously not. Some of the best programmers I know are not musicians. I can’t tell if it helps, but I would guess that developers who are also musicians are different than developers who aren’t. I don’t think that’s because being a musician changes people, though. I think it’s because the people who are both are the kind of people who need to do both.

This usually means they’re “right brain” people. This leads to a way of thinking that changes how they approach programming problems.

I think learning music (or another right brain discipline) is a good way to exercise your mind. So I wouldn’t be surprised if learning music helps people exercise their thought processes in ways that will benefit their work as programmers (or authors, or lawyers, or doctors or whatever).

I also think, though, that if we were all musicians at heart, we wouldn’t get much done. I rely heavily on my less artsy colleagues to ground me and be sometimes more pragmatic than I am. So I don’t think we all need to be a “right brain” programmer. It would be disastrous if we were.

Dave Thomas: Do they need to be? No. But many of the good ones I know are. I’d guess that density of musicians in software development is many times the population norm. But that means you could also ask the question “Do musicians have to know software development?”

I think the more interesting question is to ask “how can people best express what they enjoy doing?” because both music and software development are outlets for this.

Rail Spikes: What sort of music do you listen to? Any recommendations for Ruby developers looking to expand their musical horizons?

Chad Fowler: As I mentioned earlier, I like all kinds of music (with a few exceptions). Lately I’ve been listening to a lot of instrumental hip hop, such as DJ Qbert and Mixmaster Mike. I’ve also been getting into a genre of electronic music called “electro”, which sounds like the bleeps and bloops that are the soundtrack of my dreams (if a computer is going to generate music I always like it to sound like a computer generated it).

As for recommendations, here are a few ideas for things that most developers probably haven’t listened to:

  • Kamran Ince – He was my composition teacher and, I think, an accessible introduction to the world of “new music”, which is what we call new composed “classical” music. The term “classical” is a widely spread misnomer. It actually refers to music written in the late 18th and early 19th centuries, but most people use it to mean high brow music written for instruments like violins. So whatever you call it, Kamran Ince writes some beautiful instances of it. Specifically check out his chamber music, such as Domes and Arches.
  • Charlie Wood – I have had the pleasure of playing with Charlie on a few occasions. He is a R&B singer/organist/composer from Memphis and writes some of the most intelligent songs you’ll hear. My favorite album of his is “Who I Am”.
  • John Zorn – Zorn has been around for a long time and is a leader in the world of Avant Garde music. He’s also one of the most amazing saxophonists ever. If you’re new to this kind of thing, his Masada quartet (“radical Jewish music”) produces some great stuff that’s accessible to first time listeners. If you’re looking for something to shock your aural taste buds, try Painkiller (metal-tinged noise) or Naked City.

Dave Thomas: I listen to just about anything that’s interesting. My playlist here is very varied, and I try to add new stuff to it fairly regularly. I know people who are trained as musicians, and I tend to ask them what they’re listening to. Sometimes that leads to challenges: my ear isn’t as developed as their ears. But often it leads to whole new areas of cool stuff. So I’d recommend everyone should find a friend who knows more than you do about music and ask them to surprise and challenge you. (That advice probably applies to just about everything, thinking about it.) It’s easy to find music that stimulates your lizard brain. Get into the habit of looking for the stuff that engages at a higher level too. And, like everything, have fun with it.

Anonymize sensitive data with rake

Posted by Jon
on Wednesday, April 08

When troubleshooting a nasty bug, it’s often useful to take a look at actual production or staging data, or even pull it down into your development database. But this is a huge potential privacy and security concern. Your local environment likely isn’t as secure as your production environment, and you might not want to access this sensitive data (or give it to another team member).

Similarly, you might want to replicate your production data on a staging or QA environment to see how new code will interact with real data. Also a privacy concern.

Simple solution: anonymize the data!

In my current project, I put together an anonymize.rake task to deal with this. The most sensitive data in our app is name and phone number. Without that, private information can’t really be linked back to someone. So I pulled the 200 most common first names and 1000 most common last names (in the United States) and put them into an Anonymizer class. Call Anonymizer.random_name for a random, but realistic, name. The class also includes a simple phone number and email anonymizer.

class Anonymizer
  def self.random_name
    "#{random_first_name} #{random_last_name}"
  end
  
  def self.random_first_name
    FIRSTNAMES[rand(FIRSTNAMES.size)]
  end
  
  def self.random_last_name
    LASTNAMES[rand(LASTNAMES.size)]
  end
  
  def self.random_phone
    "612-555-#{rand(8000) + 1000}"
  end
  
  FIRSTNAMES = %w(James
  John
  Robert
  Michael

  # etc.
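
The email anonymizer isn’t shown in the excerpt above. As a rough idea of what one could look like, here is a sketch under my own assumptions: the module name AnonymizerEmailSketch is hypothetical, and the real random_email may work quite differently. It derives the address from the (already anonymized) name and uses example.com, a domain reserved for documentation, so no real mailbox can receive anything.

```ruby
# Hypothetical sketch; the actual Anonymizer.random_email is not shown
# in the post and may differ.
module AnonymizerEmailSketch
  def self.random_email(name)
    # "James O'Neil" -> "james.o.neil"
    local_part = name.downcase.gsub(/[^a-z0-9]+/, '.').gsub(/\A\.|\.\z/, '')
    # A random suffix makes collisions between people with the same
    # generated name less likely (it does not guarantee uniqueness).
    "#{local_part}.#{rand(10_000)}@example.com"
  end
end

email = AnonymizerEmailSketch.random_email("James O'Neil")
puts email
```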

The rake task is simple:

namespace :db do
  namespace :data do
    desc "Anonymize sensitive information"
    task :anonymize => :environment do
      if RAILS_ENV == 'production'
        puts "Refusing to anonymize production data. You don't really want to do that."
      else
        puts "Anonymizing all name and email records in the #{RAILS_ENV} database."
        
        # User.find(:all).each do |user|
        # user.name = Anonymizer.random_name
        # user.email = Anonymizer.random_email(user.name)
        # puts "Saving #{user.name} (#{user.email})"
        # user.save!
        # end
      end
    end
  end
end

You’ll need to do the actual implementation yourself (see the commented-out User.find(:all).each block for a sample). It would be easy enough to extend this to work with social security numbers, addresses, etc. Run with:

rake db:data:anonymize

Code: anonymize.rake

Benchmarking your Rails tests (updated)

Posted by Jon
on Friday, April 03

Update: stubbing a single integration point shaved 22 seconds off of my unit tests, reducing test time from 35 seconds to 13. See below.

The first step to faster tests is knowing what is slow. Fortunately, this is dead simple with the test_benchmark plugin by Tim Connor, originally built by Geoffrey Grosenbach. Install the plugin, and when you run your tests via Rake, you’ll see handy output showing you the slowest tests and the slowest test classes.

Step 1: Install the plugin.

script/plugin install git://github.com/timocratic/test_benchmark.git

Step 2: Run your tests

rake test

Here is a bit of output when I run the unit tests for FanChatter:

Finished in 34.838173 seconds.

Test Benchmark Times: Suite Totals:
25.393 MailReceiverTest
4.520 PhotoTest
1.429 REXMLTest
0.961 TeamTest
0.846 MessageTest

Pretty useful information. Almost 75% of our unit testing time is taken up in the MailReceiverTest. So if we want to speed up our tests, we need to make our MMS testing faster. Looking at that code, I see this line over and over:


MailReceiver.receive(fixture_mms(:fixture_name))

This method reads a test email message from the filesystem, and runs it through our mail parsing method. This is basically an integration test, hitting at least two integration points. So if we can remove these bottlenecks, we can reasonably expect a fairly large improvement in our unit test speed.

I think we could realistically reduce our unit testing time from 34 seconds to <15 seconds just by refactoring this one test method.
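
The “almost 75%” figure above is just each suite’s time divided by the total run time. Here is that arithmetic as a short Ruby sketch, using the numbers from the output above. Note that the five suites shown don’t add up to the total because the output is only an excerpt.

```ruby
total_seconds = 34.838173

suite_seconds = {
  'MailReceiverTest' => 25.393,
  'PhotoTest'        => 4.520,
  'REXMLTest'        => 1.429,
  'TeamTest'         => 0.961,
  'MessageTest'      => 0.846
}

# Share of the whole run spent in each suite, as a percentage.
shares = suite_seconds.map do |suite, seconds|
  [suite, (seconds / total_seconds * 100).round(1)]
end

shares.each { |suite, percent| puts format('%-18s %5.1f%%', suite, percent) }
```

MailReceiverTest comes out at roughly 73% of the run, which is why it is the obvious place to start.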

Other options

The test_benchmark plugin fires whenever you run your tests with rake. Tim recently patched the plugin to not fire when run with autotest, which is great. Personally, though, I don’t want to see this benchmark information every time I run my tests. So I added the following line to my test.rb environment file:

ENV['BENCHMARK'] ||= 'none'

Now, the benchmarks don’t run by default. If I want to see them, I call:

rake test BENCHMARK=true

And to see full results, showing the time it takes to run every test in the system, just call:

rake test BENCHMARK=full
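
This works because ||= only assigns when nothing is set yet, and Rake puts VAR=value arguments from the command line into ENV. Here is a small sketch of the same logic using a plain Hash in place of ENV; the benchmark_mode method is a made-up name for illustration.

```ruby
# Illustrative helper (hypothetical name): applies the same default
# that the test.rb line applies to ENV.
def benchmark_mode(env)
  env['BENCHMARK'] ||= 'none' # only fills in a value when none is set
  env['BENCHMARK']
end

plain_run = benchmark_mode({})                          # rake test
full_run  = benchmark_mode({ 'BENCHMARK' => 'full' })   # rake test BENCHMARK=full

puts plain_run # => none
puts full_run  # => full
```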

That’s it. You still have to speed up your tests, and there are many ways to do that (from mocking to simply reducing the number of calls to expensive methods), but knowing what’s slow is half the battle.

The stirring conclusion (update)

I spent a few minutes optimizing these slow tests today. First, I tried rearranging the tests to reduce unnecessary calls to the slow method (MailReceiver.receive(message)). I was able to speed MailReceiverTest from about 25 seconds to 17. Not bad, but still slow.

The real problem is that this method saves a photo. It creates a Photo record that includes a file, treated sort of like an upload, like this:

photo.uploaded_data = mms.file

This is what was slow. But my unit tests don’t actually deal with the file being saved to the filesystem; they test other things, like the right records being created, confirmation emails being sent, etc.

So I decided to try bypassing this file save/upload by stubbing the uploaded_data= method. I put the following at the top of my test class:

def setup
  Photo.any_instance.stubs(:uploaded_data=)
end

And voila! MailReceiverTest went from 25 seconds to 17 seconds to 3 seconds.
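If you are curious what that stub amounts to, here is a hand-rolled, plain-Ruby sketch of the same idea. It is not how Mocha implements `any_instance.stubs`; the `Photo` class, its `writes` counter, and `with_stubbed_upload` are all made up for illustration.

```ruby
# Stand-in for the real model: the setter is the expensive part.
class Photo
  attr_reader :writes

  def initialize
    @writes = 0
  end

  def uploaded_data=(file)
    @writes += 1
    sleep 0.01 # pretend to write the file to disk
  end
end

# Hypothetical helper: swap the setter for a no-op, run the block, then restore it.
def with_stubbed_upload
  original = Photo.instance_method(:uploaded_data=)
  Photo.send(:define_method, :uploaded_data=) { |_file| nil }
  yield
ensure
  Photo.send(:define_method, :uploaded_data=, original)
end

photo = Photo.new
with_stubbed_upload { photo.uploaded_data = 'mms.jpg' }
puts photo.writes # the slow path was skipped inside the block

photo.uploaded_data = 'mms.jpg'
puts photo.writes # the real setter is back afterwards
```

The trade-off is the same as with the Mocha version: the tests no longer prove that the file is saved, so that behavior needs its own (slower) test somewhere.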

10 Cool Things in Rails 2.3

Posted by Luke Francl
on Monday, March 30

This was presented to the Ruby Users of Minnesota on March 30, 2009.

Here’s a quick look at 10 new Rails features that I think are cool. Not all of them are huge new features; some just help solve annoying problems. I’ve also created a simple application that demonstrates most of these features. You can get it at BitBucket.

1. Rails Boots Faster in Development Mode

This is something all Rails developers can appreciate. In development mode, Rails now lazy loads as much as possible so that the server starts up much faster.

This is so fast that, instead of relying on reloading (which doesn’t pick up changes to gems, lib directory, etc), one developer wrote a script (does anyone have the link for this?) that watches for file system changes and restarts your script/server process.

Using an empty Rails app, I got the following (totally non-scientific) real times for time script/server -d:

Rails 2.2: 1.461s
Rails 2.3: 0.869s

Presumably this difference would grow as more libraries were used, because Rails 2.3 will lazy load them. However I was too lazy to build up equivalent Rails 2.2 and 2.3 applications to try that out.

2. Rails Engines Officially Supported

Inspired by Merb’s slices implementation, Rails added official support for Engines, which are self-contained Rails apps that you can install into another application. Engines can have their own models, controllers, and views, and add their own routes.

Previously this was possible using the Engines plugin, but Engines would often break between Rails versions. Now that they are officially supported, this should be less frequent.

There are still some features from the unofficial Engines plugin that are not part of Rails core. You can read about that at the Rails Engines site.

3. Routing Improvements

RESTful routes now use less memory because formatted_* routes are no longer generated, resulting in a 50% memory savings.

Given this route:

map.resources :users

If you want to access the XML formatted version of a user resource, you would use:

user_path(123, :format => 'xml')

In Rails 2.3, :only and :except options to map.resources are not passed down to nested routes. The previous behavior was rather confusing so I think this is a good change.

map.resources :users, :only => [:index, :new, :create] do |user|
  # now will generate all the routes for hobbies
  user.resources :hobbies
end

4. JSON Improvements

ActiveSupport::JSON has been improved.

to_json will always quote keys now, per the JSON spec.

Before:

{123 => 'abc'}.to_json
=> '{123: "abc"}'

Now:

{123 => 'abc'}.to_json
=> '{"123": "abc"}'

Escaped Unicode characters will now be unescaped.

Before:

ActiveSupport::JSON.decode("{'hello': 'fa\\u00e7ade'}")
=> {"hello"=>"fa\\u00e7ade"}

Now:

ActiveSupport::JSON.decode("{'hello': 'fa\u00e7ade'}")
=> {"hello"=>"façade"}

See ticket 11000 for details.
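The same two behaviors can be seen with Ruby's standard `json` library, which makes for a quick check outside Rails. This sketch uses the stdlib parser, not ActiveSupport::JSON, and the stdlib parser is stricter: it only accepts double-quoted JSON strings.

```ruby
require 'json'

# Non-string keys are converted to strings and quoted.
puts JSON.generate({ 123 => 'abc' })

# A \uXXXX escape inside the JSON text is decoded to the real character.
# The single-quoted Ruby string keeps the backslash literally.
decoded = JSON.parse('{"hello": "fa\u00e7ade"}')
puts decoded['hello']

# Single-quoted JSON strings are not valid JSON for the stdlib parser.
begin
  JSON.parse("{'hello': 'world'}")
rescue JSON::ParserError
  puts 'single-quoted JSON rejected'
end
```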

5. Default scopes

Prior to Rails 2.3, if you executed a find without any options, you’d get the objects back unordered (technically, the database does not guarantee a particular ordering, but it would typically be by primary key, ascending).

Now, you can define the default sort and filtering options for finding models. The default scope works just like a named scope, but is used by default.

class User < ActiveRecord::Base
  default_scope :order => '`users`.name asc'
end

The default options can always be overridden using a custom finder.

User.all # will use default scope
User.all(:order => 'name desc') # will use passed in order option.

Example:

User.create(:name => 'George')
User.create(:name => 'Bob')
User.create(:name => 'Alice')

puts User.all.map { |u| "#{u.id} - #{u.name}" }

3 - Alice
2 - Bob
1 - George

Note how the default order is respected.

6. Nested Transactions

Pass :requires_new => true to ActiveRecord::Base.transaction and a nested transaction will be created.

User.transaction do
  user1 = User.create(:name => "Alice")

  User.transaction(:requires_new => true) do
    user2 = User.create(:name => "Bob")
  end
end

This is actually emulated using save points because most databases do not support nested transactions. Some databases (SQLite) don’t support either save points or nested transactions, so in that case this works just like Rails 2.2 where the inner transaction(s) have no effect and if there are any exceptions the entire transaction is rolled back.

7. Asset Host Objects

Since Rails 2.1, you could configure Rails to use an asset_host that was a Proc with two arguments, source and request.

For example, some browsers complain if an SSL request loads images from a non-secure source. To make sure SSL always loads from the same host, you could write this (from the documentation):

ActionController::Base.asset_host = Proc.new { |source, request|
  if request.ssl?
    "#{request.protocol}#{request.host_with_port}"
  else
    "#{request.protocol}assets.example.com"
  end
}

This works but it’s kind of messy and it’s difficult to implement complicated logic. Rails 2.3 allows you to implement the logic in an object that responds to call with one or two parameters, like the Proc.

The above Proc could be implemented like this:

class SslAssetHost
  def call(source, request)
    if request.ssl?
      "#{request.protocol}#{request.host_with_port}"
    else
      "#{request.protocol}assets.example.com"
    end
  end
end

ActionController::Base.asset_host = SslAssetHost.new

David Heinemeier Hansson has already created a better plugin that handles this case: asset-hosting-with-minimum-ssl. It takes into account the peculiarities of the different browsers to use SSL as little as possible, reducing load on your server.
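One nice side effect of moving the logic into an object is that it can be exercised without Rails. Here is a small sketch; `FakeRequest` is a made-up stand-in that provides only the three request methods the class uses, and the class is repeated so the snippet runs on its own.

```ruby
class SslAssetHost
  def call(source, request)
    if request.ssl?
      "#{request.protocol}#{request.host_with_port}"
    else
      "#{request.protocol}assets.example.com"
    end
  end
end

# Hypothetical stand-in for the Rails request object.
FakeRequest = Struct.new(:protocol, :host_with_port, :secure) do
  def ssl?
    secure
  end
end

host = SslAssetHost.new
secure_request = FakeRequest.new('https://', 'www.example.com', true)
plain_request  = FakeRequest.new('http://', 'www.example.com', false)

puts host.call('/images/logo.png', secure_request) # same host as the page
puts host.call('/images/logo.png', plain_request)  # dedicated asset host
```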

8. Easily update Rails timestamp fields

If you’ve ever wanted to update Rails’ automatic timestamp fields created_at or updated_at you’ve noticed how painful it can be. Rails REALLY didn’t want you to change those fields.

Not any more!

Now you can easily change created_at and updated_at:

User.create(:name => "Alice", :created_at => 3.weeks.ago, :updated_at => 2.weeks.ago)

=> #<User id: 3, name: "Alice", created_at: "2009-03-08 00:06:58", updated_at: "2009-03-15 00:06:58">

Remember, if you don’t want your users changing these fields, you should make them attr_protected.

9. Nested Attributes and Forms

This greatly simplifies complex forms that deal with multiple objects.

First, nested attributes allow a parent object to delegate assignment to its child objects.

class User < ActiveRecord::Base
  has_many :hobbies, :dependent => :destroy

  accepts_nested_attributes_for :hobbies
end

User.create(:name => 'Stan',
            :hobbies_attributes => [{:name => 'Water skiing'},
                                    {:name => 'Hiking'}])

Nicely, this will save the parent and its associated models together and if there are any errors, none of the objects will be saved.

Forms with complex objects are now straightforward. To use this in your forms, use the FormBuilder instance’s fields_for method.

<% form_for(@user) do |f| %>
  <div>
    <%= f.label :name, "User name:" %>
    <%= f.text_field :name %>
  </div>

  <div>
    <h2>Hobbies</h2>

    <% f.fields_for(:hobbies) do |hf| %>
      <div>
        <%= hf.label :name, "Hobby name:" %>
        <%= hf.text_field :name %>
      </div>
    <% end %>
  </div>

  <%= f.submit 'Create' %>
<% end %>

One catch is that a form is displayed for every associated object. New objects obviously have no associations so you have to create a dummy object in your controller.

class UsersController < ApplicationController
  def new
    # In this contrived example, I create 3 dummy objects so I'll get
    # 3 blank form fields.
    @user = User.new
    @user.hobbies.build
    @user.hobbies.build
    @user.hobbies.build
  end
end

There are a lot of options for nested forms including deleting associated objects, so be sure to read the documentation. Ryan Daigle also has a great write-up.

10. Rails Metal \m/

You can now write very simple Rack endpoints for highly trafficked routes, like an API. These are slotted in before Rails picks up the route.

A Metal endpoint is any class that conforms to the Rack spec (i.e., it has a call method that takes an environment and returns an array of status code, headers, and content).

Put your class in app/metal (not generated by default). Return a 404 response code for any requests you don’t want to handle. These will get passed on to Rails.

There’s a generator you can use to create an example Metal endpoint:

script/generate metal classname

In my sample app, I have what I would consider the “minimally useful” Rails Metal endpoint. It responds to /users.js and returns the list of users as JSON.

class UsersApi
  def self.call(env)
    # if this path was /users.js, reply with the list of users
    if env['PATH_INFO'] =~ /^\/users\.js/
      # Rack expects the body to respond to each, so wrap the string in an array
      [200, {'Content-Type' => 'application/json'}, [User.all.to_json]]
    else
      # otherwise, bail out with a 404 and let Rails handle the request
      [404, {'Content-Type' => 'text/html'}, ['not found']]
    end
  end
end
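Because an endpoint like this is just an object with a call method, the 404 pass-through can be tried out in plain Ruby. The sketch below is only an illustration of the idea, not how Rails chains Metal apps internally: `USERS` replaces `User.all`, and `RailsApp` and `dispatch` are hypothetical names.

```ruby
require 'json'

# Hypothetical stand-in for User.all so the sketch runs without a database.
USERS = [{ 'name' => 'Alice' }, { 'name' => 'Bob' }].freeze

class UsersApi
  def self.call(env)
    if env['PATH_INFO'] =~ %r{\A/users\.js}
      [200, { 'Content-Type' => 'application/json' }, [JSON.generate(USERS)]]
    else
      [404, { 'Content-Type' => 'text/html' }, ['not found']]
    end
  end
end

# Hypothetical stand-in for the rest of the Rails stack.
RailsApp = lambda do |env|
  [200, { 'Content-Type' => 'text/html' }, ["Rails handled #{env['PATH_INFO']}"]]
end

# Try each app in turn and move on whenever one answers 404.
def dispatch(apps, env)
  response = nil
  apps.each do |app|
    response = app.call(env)
    break unless response[0] == 404
  end
  response
end

status, _headers, body = dispatch([UsersApi, RailsApp], { 'PATH_INFO' => '/users.js' })
puts status, body

status, _headers, body = dispatch([UsersApi, RailsApp], { 'PATH_INFO' => '/about' })
puts status, body
```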

If you want a little bit more help, you can use any other Rack-based framework, for example Sinatra.

For more details on how Rails Metal works, check out Jesse Newland’s article about it.

Thanks for reading! For more details about new features in Rails 2.3, read the excellent release notes.

Slow tests are a bug

Posted by Jon
on Tuesday, March 10

I’ve been doing TDD for about three years now. Once I figured out how to do it right, it became a natural part of how I program, and I can’t really imagine doing development without it. This isn’t to say that TDD is the only approach to writing quality software or that unit testing is the only kind of testing that matters. But it sure is useful.

The Ruby world talks a lot about TDD, more so than many other developer communities. We have not one, not two, but at least half a dozen testing libraries that are actively being used and developed. For most Ruby developers, the question isn’t “Do you test?” but “BDD or TDD?” or even “RSpec, Shoulda, or Bacon?” We often use at least 2-3 layers of automated testing, and sometimes use different tools for each layer. Most Ruby conferences devote at least a few talks each day to testing-related topics. We’re test fanboys and -girls, for better or for worse.

But in spite of this, we rarely talk about test speed. Sure, there are purists who believe that unit tests shouldn’t touch the database because anything that touches the DB is actually an integration test. But few Ruby testers actually take this long and lonely road, and I personally prefer tests that talk to a database, at least some of the time.

And it’s true that others have written libraries to distribute their tests across multiple machines. But that’s the exception that proves the rule – the only reason to distribute your tests is that they’re too slow to begin with.

Most Rails projects I’ve worked on have ended up at around 3,000-15,000 lines of code, with roughly as many lines of test code, and most have test suites that take a minute or more to run. Our test suite for Tumblon, for instance, churns along for 2.5 minutes. This is too slow. And slow tests are a problem for at least two reasons: they slow down your development and decrease code quality.

1. Slow tests slow down development. If you’re practicing TDD, you want to see a test fail before you make it succeed. Two minutes is far too long for this feedback loop to be effective. Of course, you can (and should) just run the test classes that correspond to your code as you program – no need to run your entire test suite every time you write your failing tests. But even still, the test time bar should ideally be set quite low. Frequent 5-10 second delays are enough to break my concentration, and I find myself cmd-tabbing over to other programs if I have to wait more than a few seconds for a test to run. I don’t know of any hard-and-fast rules, but I know that as soon as my test suite runs longer than 30-45 seconds, and individual test classes take longer than 2-3 seconds, I’m less happy and less productive.

2. Slow tests decrease code quality. There are two simple reasons for this. First, if slow tests break your flow, you’re not only going to write code more slowly: you’re also going to write worse code. Second, if your tests are too slow, you’re not going to wait for them to finish before you move on to the next task. Or worse, you’re not going to run them at all.

So, how can I speed up my tests?

Fortunately, this problem can be addressed. There are plenty of ways to speed up tests. On a current project, we’ve managed to cut our test time substantially – a recent test refactoring cut test time from 129.45 seconds to 31.04 seconds, without removing any tests. That’s a 76% speedup. But we still have room for improvement.
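For the record, the arithmetic behind that figure is easy to check. The 76% is the share of run time removed; expressed as a speed factor, the suite is a bit over four times faster.

```ruby
before = 129.45 # seconds, old suite
after  = 31.04  # seconds, after the refactoring

reduction = (1 - after / before) * 100 # percent of run time removed
factor    = before / after             # how many times faster

puts format('%.1f%% less time, %.2fx faster', reduction, factor)
```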

Really quickly, here are at least five ways to speed up your test suite. I hope to post more on each of these over the next month or two.

1. Use a test database instead of fixtures/factories/etc.

2. Only touch the database when necessary

3. Organize your tests to avoid duplicate execution

4. Separate slow tests out into a lazier testing layer

5. Run a Rails test server

I’d love to see the Rails community devote more of its enthusiasm for testing to the question of test speed. There’s nothing wrong with improving our test frameworks, and let’s keep doing that. But let’s also make these frameworks fast.

Dealing with 'duplicate key violates unique constraint' on the primary key

Posted by Luke Francl
on Thursday, March 05

I recently had to work through a problem where inserts were failing due to duplicate primary keys.

Here’s the error (edited for clarity):

PGError: ERROR: duplicate key violates unique constraint "contracts_pkey": INSERT INTO "contracts" ('column_1', 'column_2', 'column_3') VALUES('abc', '123', 'xyz') RETURNING "id"

What is going on here? I’m not even providing the primary key id—that comes from the sequence.

Hey wait a second…

What was happening is that we had a data import that didn’t use the sequence. So the integers returned by the sequence had already been used, causing a duplicate primary key.

To solve the problem, I had to reset the sequence, like this:

select setval('contracts_id_seq', (select max(id) + 1 from contracts));

Today's hard-won lesson

Posted by Luke Francl
on Thursday, March 05

A float subtracted from an integer results in a float. When typecast by ActiveRecord, this is converted to an integer.

validates_numericality_of with :only_integer => true results in a rather obscure error message if a non-integer is present (“is not a number”).

validates_numericality_of uses the attribute value before type cast for its validation.

...

That means if you calculate a value before validation, what is printed out by using the attribute method is different than what validates_numericality_of is using for validation. You need to ensure that the value of attr_name_before_type_cast is an integer!
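The mismatch is easy to reproduce in plain Ruby. This sketch assumes the integer check is a regex match against the string form of the raw value; `INTEGER_PATTERN` and `looks_like_integer?` are illustrative names, not ActiveRecord's API.

```ruby
# Assumed integer check: a regex over the raw value's string form.
INTEGER_PATTERN = /\A[+-]?\d+\z/

def looks_like_integer?(raw_value)
  !!(raw_value.to_s =~ INTEGER_PATTERN)
end

raw  = 10 - 2.0 # Integer minus Float gives a Float
cast = raw.to_i # roughly what reading an integer attribute shows

puts raw.class                      # the raw value is a Float
puts looks_like_integer?(raw)       # "8.0" does not look like an integer
puts looks_like_integer?(cast)      # the cast value does
puts looks_like_integer?(raw.round) # rounding before assignment avoids the problem
```

So the attribute reader happily reports 8 while the validation is looking at 8.0. Converting the calculated value with `round` or `to_i` before assigning it keeps the two in agreement.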

I just spent way too much time figuring this out.

Fetcher moved to GitHub

Posted by Luke Francl
on Sunday, February 15

A quick FYI for those who have been using the Fetcher plugin that we wrote (and use on FanChatter Events)...

I have moved the Fetcher plugin repository to GitHub. You can get it at git://github.com/look/fetcher.git

Happy forking!

(And a shameless plug for Mike Mondragon and my book: if you need more details about how to make your app speak email, look no further than Receiving Email with Ruby!)

Rescuing autotest from a conflicting plugin

Posted by Jon
on Saturday, February 14

For the longest time, I wasn’t able to run autotest on one of my projects. That was OK; I was intrigued by autotest, but had never really committed to it. The problem: whenever I would try to run autotest, I’d get the following error:


loading autotest/rails_rspec
Autotest style autotest/rails_rspec doesn't seem to exist. Aborting.

I’m running Shoulda, not RSpec, so I had no idea why this was happening. I tried installing (and uninstalling) RSpec in various configurations, to no avail. Nothing worked.

Then I started a new project. Autotest worked just fine on it. After a few days, I got used to autotest, and a few days later, I came to really like it. It helps me get into a TDD “flow” – all tests pass; write failing tests; write code; all tests pass.

So when I came back to my previous project where autotest didn’t work, I decided to dig deeper. Eventually I found a plugin that was causing the problem: acts-as-taggable-on. The plugin was written to allow autotesting, as explained in a blog post. Supposedly, this is a different autotest instance from your app’s main instance, but it wasn’t working that way for me.

The fix? Delete lib/discover.rb from the acts-as-taggable-on plugin. That’s it – autotest works now.

In the end, I maybe could have solved the problem by getting RSpec configured properly, but just running the gem locally didn’t do the trick for me, and I don’t want to add any code to my app to support autotesting of a plugin that I never want to test.

So should plugins even ship with test code? Yes, they should. Not for normal use; I never run plugin tests, assuming instead that the plugin is tested by the author. But if an open source plugin ships without tests, it’s that much harder for other developers to fork/fix/improve the plugin. But really, that’s about the only reason for plugin/gem tests. And they should never touch application tests.