validates_length_of byte counting gotcha

Posted by Luke Francl
on Sunday, July 19

Watch out for validates_length_of if you need to make sure a string is a certain number of bytes long. For example, SMS messages can be no longer than 160 bytes in length. I recently got bit by this because some unicode “curly” quotes slipped into a reply message, but they weren’t detected by the validation.

Here’s the problem.

Consider this string:

str = "€"

It is 3 bytes long:

str.size => 3

However, ActiveRecord’s validates_length_of records this as only one character, because it uses str.split(//).size to measure the size.

If you NEED to be certain that a string is less than a certain number of bytes, you’ll need to override the default behavior of validates_length_of.

Fortunately, you can supply your own tokenizer, which makes this easy. The tokenizer is called, and size is called on its return value to find out how many tokens there are. Since String responds to size which returns the number of bytes, you can simply return the attribute value itself as the tokenizer, like this:

validates_length_of :message, :maximum => 160, :tokenizer => lambda { |str| str }

Will this still work in Ruby 1.9? I’m not sure. I now have a test case which will warn me if it doesn’t…
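For reference, Ruby 1.9 and later count characters with String#size and expose the byte count as String#bytesize. Here is a minimal plain-Ruby sketch (no Rails involved), assuming a UTF-8 string. Under those versions a byte-counting tokenizer would presumably need to return something like the byte array rather than the string itself; that last part is an assumption, not something tested against validates_length_of.

```ruby
# The euro sign, written as an escape so the source file's encoding doesn't matter.
str = "\u20AC"

char_count = str.size      # characters on Ruby 1.9+
byte_count = str.bytesize  # bytes in the UTF-8 encoding

# An array with one element per byte, so calling size on it counts bytes.
byte_tokens = str.bytes.to_a

puts "characters: #{char_count}, bytes: #{byte_count}, tokens: #{byte_tokens.size}"
```
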

Testing HTTP Authentication

Posted by Luke Francl
on Tuesday, June 30

If you ever need to test HTTP Authentication in your functional tests, here is how you do it:

def test_http_auth
  @request.env['HTTP_AUTHORIZATION'] = ActionController::HttpAuthentication::Basic.encode_credentials("quentin", "password")
  get :show, :id => @foobar.id

  assert_response :success
end

This is much like testing SSL.

Hat tip: Philipp Führer for Functional test for HTTP Basic Authentication in Rails 2.
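In case it helps to see what ends up in that header: HTTP Basic credentials are the string user:password, Base64-encoded, with a "Basic" prefix. A rough plain-Ruby sketch follows. The name basic_auth_header is made up for illustration; it is not the Rails helper itself.

```ruby
# Hypothetical helper that builds the value of an Authorization header.
def basic_auth_header(user, password)
  # pack("m0") is Base64 encoding without trailing line breaks.
  token = ["#{user}:#{password}"].pack("m0")
  "Basic #{token}"
end

header = basic_auth_header("quentin", "password")

# Decode it again to check the round trip.
scheme, token = header.split(" ", 2)
decoded = token.unpack("m").first

puts header
puts decoded
```
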

Adding Routes for Tests

Posted by Luke Francl
on Monday, June 22

I like to be extremely judicious with use of routes. Fewer routes means less memory consumption and fewer confusing magical methods.

I always delete the default route map.connect ':controller/:action/:id' (you should too, otherwise all your pretty RESTful routing is easily circumvented). Since Rails now has the ability to remove unneeded RESTful routes, I’ve been removing those, too.

However, this judiciousness recently painted me into a corner. I have a controller action that I would like to test and it’s wired up like this:

map.logout '/logout', :controller => 'user_sessions', :action => 'destroy', :method => 'delete'

I don’t have this mapped any other way, because why should I?

def test_logout_should_redirect_to_root_path
  UserSession.create(User.first)

  delete :destroy

  assert_match /logged out/, flash[:notice]
  assert_redirected_to root_path
end

Unfortunately, the test fails with ActionController::RoutingError: No route matches {:action=>"destroy", :controller=>"user_sessions"}! Huh?

The problem is that the delete (and get, post, etc.) method can’t find the route that I created.

Initially, I worked around this using with_routing to define a whole new set of routes just for that test.

with_routing do |set|
  set.draw do |map|
    map.resource :user_sessions, :only => [:destroy]
    map.root :controller => 'foobars', :action => 'index'
  end

  delete :destroy

  assert_match /logged out/, flash[:notice]
  assert_redirected_to root_path
end

But that was annoying. And after I had more than one route exhibiting this problem, it got really annoying.

Fortunately, I found Sam Ruby’s post Keeping Up With Rails about the challenge of Rails’ minor, quasi-documented API changes. Sam’s post has a bit about how you can add new routes without clearing the existing routes in Rails 2.3.2, which I knew was possible. Following Sam’s link to the commit (there are no docs for this) showed how to do it.

Now, I’ve added this to test_helper.rb:

class ActionController::TestCase
  # add a catch-all route for the tests only.
  ActionController::Routing::Routes.draw { |map| map.connect ':controller/:action/:id' }
end

The downside to this is that real problems with broken routes may get swept under the rug. You could be more restrictive with the routes you are adding just for tests to overcome that problem.

Update: Thanks to Adam Cigánek in the comments for pointing out my error in why the route didn’t get picked up in the tests. I had the condition hash wrong!

Instead of:

map.logout '/logout', :controller => 'user_sessions', :action => 'destroy', :method => 'delete'

It should be:

map.logout '/logout', :controller => 'user_sessions', :action => 'destroy', :conditions => {:method => :delete}

The first way I had worked correctly when testing manually, but only because without :method, the route responds to all HTTP methods (still no clue why my test didn’t pick it up, though).

Interestingly enough, there’s another gotcha here. Notice that I specified :method => 'delete'. Even when put into the :conditions hash, that doesn’t work. You MUST pass a symbol (:delete) for the HTTP method.

This fixed my problem, but if I ever do need to add routes for tests, now I know how…

JavaScript gotcha: storing objects in an associative array

Posted by Luke Francl
on Wednesday, June 17

I just ran into a tricky gotcha in JavaScript.

I was trying to store some objects in an associative array. Based on my experience with Java, Ruby, and other languages, I expected that given code like this:

var dictionary = {};

var obj1 = {};
var obj2 = {};

dictionary[obj1] = 'foo';
dictionary[obj2] = 'bar';

The result of dictionary[obj1] would be ‘foo’ and dictionary[obj2] would be ‘bar’.

This is not the case!

The problem is that JavaScript objects are not really hash tables. They’re associative arrays, and the key can only be a String. When you insert an object into an associative array, toString() is called and that is used as the key. Unfortunately, the default toString implementation for JavaScript objects returns “[object Object]”...which is not only very unhelpful when debugging, but doesn’t provide you with a unique key for your associative array.

You can work around this problem by overriding toString. Or you can figure out another way to associate your object with a value. D’oh!
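For contrast, here is roughly the behavior I was expecting, sketched in Ruby, next to a simulation of what JavaScript effectively does. The js_key method is made up for illustration; it just mimics the default toString() result.

```ruby
obj1 = Object.new
obj2 = Object.new

# Ruby hashes key on the object itself, so two distinct objects get two entries.
by_object = {}
by_object[obj1] = 'foo'
by_object[obj2] = 'bar'

# Hypothetical stand-in for JavaScript's default toString().
def js_key(_obj)
  '[object Object]'
end

# Every object collapses to the same string key, so the second write
# overwrites the first.
by_string = {}
by_string[js_key(obj1)] = 'foo'
by_string[js_key(obj2)] = 'bar'

puts by_object.size
puts by_string.size
```
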

Today's hard-won lesson

Posted by Luke Francl
on Thursday, March 05

A float subtracted from an integer results in a float. When typecast by ActiveRecord, this is converted to an integer.

validates_numericality_of with :only_integer => true results in a rather obscure error message if a non-integer is present (“is not a number”).

validates_numericality_of uses the attribute value before type cast for its validation.

...

That means if you calculate a value before validation, what is printed out by using the attribute method is different than what validates_numericality_of is using for validation. You need to ensure that the value of attr_name_before_type_cast is an integer!
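Here is a plain-Ruby sketch of the mismatch. The regular expression is an assumption about what an integer-only check on the raw value might look like; it is not copied from Rails.

```ruby
quantity = 10 - 2.0 # Integer minus Float gives a Float

before_type_cast = quantity      # roughly what the validation inspects
after_type_cast  = quantity.to_i # roughly what an integer attribute reader shows

# Assumed shape of an integer-only check on the raw value's string form.
INTEGER_ONLY = /\A[+-]?\d+\z/

raw_passes   = !(before_type_cast.to_s =~ INTEGER_ONLY).nil?
fixed_passes = !(quantity.to_i.to_s =~ INTEGER_ONLY).nil?

puts "raw value #{before_type_cast.inspect} passes: #{raw_passes}"
puts "value converted with to_i passes: #{fixed_passes}"
```

Converting the calculated value with to_i before assigning it keeps the raw value and the typecast value in agreement.
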

I just spent way too much time figuring this out.

Finding CSS problems with binary search

Posted by Luke Francl
on Friday, October 03

I’m trying to integrate the YUI Rich Text Editor into an application which has some CSS rules that aren’t playing nice.

But how to find what’s causing the problem in a 1700 line CSS file?

Binary search to the rescue!

Binary search (see also this introduction from Princeton) is a divide-and-conquer algorithm for finding an item in a sorted list. On each iteration, it cuts the total search space in half.

Binary search diagram

It can also be used as a heuristic tool for finding software defects. The bisect command in git and Mercurial allows you to do a binary search through your code to find the revision where a bug was introduced.

Lately, I’ve been using it to find CSS problems.

Here’s what I do. I select half the file (roughly—obviously, you need to break it after a CSS expression), and delete it. Then I reload the page.

  • Problem still there? I know the problem is in the half I didn’t delete.
  • Problem fixed? I know the problem is in the half that I just deleted.

Restore the file, and delete the next subdivision. Within just a few iterations, you should be on a specific CSS rule that’s causing problems.
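The same procedure can be sketched in Ruby. This is only an illustration under two assumptions: exactly one rule causes the problem, and the rules don’t depend on each other. The method name find_culprit and the block that stands in for “reload the page and look” are both made up.

```ruby
# Narrow a list of rules down to the single one that triggers the problem.
# The block receives the rules that were kept and returns true if the
# problem is still visible with only those rules in place.
def find_culprit(rules)
  candidates = rules
  while candidates.size > 1
    half    = candidates.size / 2
    kept    = candidates[0...half]
    deleted = candidates[half..-1]
    # Problem still there? It is in the kept half. Fixed? It was in the deleted half.
    candidates = yield(kept) ? kept : deleted
  end
  candidates.first
end

rules = %w[body h1 .nav .editor table .footer a:hover]
culprit = find_culprit(rules) { |subset| subset.include?('.editor') }
puts culprit
```

With 1700 lines, halving gets you to a single line in about eleven steps, since 2 to the 11th power is 2048.
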

Anyway, maybe it’s old hat to most people, but I thought it was kind of cool.

Publishing non-ActiveRecord objects in an Atom feed with Rails

Posted by Luke Francl
on Wednesday, October 01

Rails comes with a method called atom_feed that makes it really easy to publish an Atom feed for a list of ActiveRecord objects.

Here’s an example from the documentation:

atom_feed do |feed|
  feed.title("My great blog!")
  feed.updated((@posts.first.created_at))

  for post in @posts
    feed.entry(post) do |entry|
      entry.title(post.title)
      entry.content(post.body, :type => 'html')

      entry.author do |author|
        author.name("DHH")
      end
    end
  end
end

For each <entry>, the Atom elements for <id>, <published>, <updated>, and <link> are automatically assigned based on the methods id, created_at, updated_at and polymorphic_url respectively.

However, I wanted to publish a non-ActiveRecord object in the feeds for Tumblon. Specifically, a group of photos.

multi-photo post example

Each of these elements can be specified directly, except <id>. So you could do something like this:

feed.entry(post, :url => my_custom_url, :published => post.some_date, :updated => post.some_other_date) do |entry|
  entry.title(post.title)
  entry.content(post.body, :type => 'html')

  entry.author do |author|
    author.name("DHH")
  end
end

The big problem with this is if you’re using a non-ActiveRecord object, it won’t have a stable id, and so the Atom feed’s <id> will change every time the feed is generated—leading to duplicate posts in feed readers. So you need to conjure up a stable id for your non-ActiveRecord object.

In my case, I created a stable id using my list of photos. I also added created_at and updated_at methods so I don’t have to specify them when creating the feed.

class MultiPhotoPost
  # truncated ...
  
  # override id so that the atom feed <id> doesn't change between runs
  def id
    created_at.to_i
  end
  
  def created_at
    @created_at ||= photos.min { |a, b| a.created_at <=> b.created_at }.created_at
  end
  
  def updated_at
    @updated_at ||= photos.max { |a, b| a.updated_at <=> b.updated_at }.updated_at
  end

end

With this in place, you’ll just need to specify the URL for the object (since polymorphic_url won’t work) when you create the feed <entry>.
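To see the stable id idea outside Rails, here is a small runnable sketch. Photo and PhotoGroup are hypothetical stand-ins (a Struct and a plain class), not the real Tumblon models, and the timestamps are arbitrary.

```ruby
# Hypothetical stand-in for a photo record.
Photo = Struct.new(:created_at, :updated_at)

# Hypothetical stand-in for MultiPhotoPost.
class PhotoGroup
  attr_reader :photos

  def initialize(photos)
    @photos = photos
  end

  # Derived from the data, so it is the same every time the feed is built.
  def id
    created_at.to_i
  end

  def created_at
    @created_at ||= photos.map { |photo| photo.created_at }.min
  end

  def updated_at
    @updated_at ||= photos.map { |photo| photo.updated_at }.max
  end
end

photos = [
  Photo.new(Time.at(1_222_819_200), Time.at(1_222_822_800)),
  Photo.new(Time.at(1_222_815_600), Time.at(1_222_826_400))
]

first_run  = PhotoGroup.new(photos).id
second_run = PhotoGroup.new(photos.reverse).id
puts first_run == second_run
```

One caveat with this sketch: the id only stays put as long as the earliest photo stays in the group.
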

Testing SSL in Rails

Posted by Luke Francl
on Friday, September 12

Here’s a quick tip for how to test that your application is using SSL correctly.

Enabling SSL in tests

You can turn SSL on in functional tests like this:

@request.env['HTTPS'] = 'on'

I have this turned on in my setup method and then override it for tests that don’t use SSL. To turn SSL off, use this:

@request.env['HTTPS'] = nil

(Via snippets)

Testing for SSL redirect

To test whether your users will get redirected to SSL for particular actions, you can write a test like this. First, it turns off SSL. Then it makes a request to an action that should require SSL, and asserts that the request is redirected to the same URL, but using the https protocol.

def test_get_new_with_http_should_redirect_to_ssl
  @request.env['HTTPS'] = nil
  get :new
  assert_redirected_to "https://" + @request.host + @request.request_uri
end

Implementing the code

Having done this, you probably have a bunch of failing tests. To make them pass, get the SslRequirement plugin (I’m actually using Doug Johnson’s fork that adds support for different SSL domains, like secure.example.com).

Include SslRequirement in application.rb, and then in your secure controllers, add a line like this:

ssl_required :new, :create

Then run your tests.

Now the thing about this is that it is also going to try to send you to an HTTPS URL in development mode. Since most people don’t actually have SSL set up on their development environment, it makes sense to disable this in development.

ssl_required :new, :create if RAILS_ENV == 'production' || RAILS_ENV == 'test'

Or for you fancy-pants Rails 2.1 types:

ssl_required :new, :create if Rails.env.production? || Rails.env.test?

(Yeah, you could also use unless Rails.env.development?, but I actually have a couple other environments configured that I do not want to enable SSL for.)

Photo by Darwin Bell.

How to fix your Rails helpers

Posted by Eric Chapweske
on Thursday, August 21

Many Rails applications have this basic structure in their helpers folder:

application_helper.rb
accounts_helper.rb
audits_helper.rb
comments_helper.rb
images_helper.rb
orders_helper.rb 
posts_helper.rb
sessions_helper.rb
users_helper.rb
... etc. 

The most important file, as we all know, is application_helper.rb, because this is where code goes to die. It’s often a few hundred lines of randomly added, unrelated methods. This is a confusing, scary place for methods to be. Here are a few tips for rescuing them:

What’s that noise?

Most projects use script/generate to make their controllers. This leaves a ton of empty helper files. Remove them to better focus on the task at hand:

hg remove accounts_helper.rb audits_helper.rb images_helper.rb ...

Usually this will prune the list down to two or three files.

Farewell, application_helper.rb

The easiest way to clean up the ApplicationHelper module is to remove it. This is a great way to ensure methods don’t stay there, or get inserted in the future. But, if they don’t belong in ApplicationHelper, where’s the best place for them?

1. Remove fake helpers

Helpers are markup generators. If they’re not involved in generating markup, they’re not helpers and can be pushed into a model:

helpers/application_helper.rb

module ApplicationHelper
  def birthday_in_words(child, prefix = 'born')
    "(#{prefix} #{child.birthday_in_words})" if child.birthday?
  end
end

models/child.rb

class Child < ActiveRecord::Base
  def birthday_in_words(prefix = 'born')
    "(#{prefix} #{birthday})" if birthday?
  end
end

Unfortunately, Rails relies on this ambiguous ‘helper’ naming convention internally, making it tricky to change the naming in your own application. (I find the concept of a helper to be… unhelpful, and will be referring to them as ‘markup generators’ for the rest of this post.)

2. Separate into logical units

By default, Rails makes all markup generators available to any view via helper :all. The relationship between a model and markup generation tends to be incidental, and script/generate’s ‘ModelNameHelper’ convention is a bit sketchy. Better to name it like anything else, so a module that generates, say, HTML for tables, gets named TableHelper.

helpers/table_helper.rb

module TableHelper
  def default_sort_column(title, direction)
    ...
  end
  
  def sort_column(title, direction)
    ...
  end
end

Better yet, if the generation starts getting complex, take a page from one of Ryan Bates’s screencasts and turn it into a class.

3. Test!

Well organized code is great. Tested, well organized code? Even better!

require File.join(File.dirname(__FILE__), '..', 'test_helper')
require File.join('action_view', 'test_case')

class AlexaThumbnailHelperTest < ActionView::TestCase

  context "Generating Alexa image tags" do
    setup do
      @url = 'http://ted.com'
      @alexa_image_tag_html = %(<img src="http://ast.amazonaws.com/?...=#{@url}"/>)
    end
    
    should "return the image tag as html" do
      assert_equal @alexa_image_tag_html, alexa_image_tag(@url)
    end
  end

end

And a method

On a related note, in a few cases it’s useful to allow your templates access to controller methods. Rails provides helper_method to handle this:

controllers/application_controller.rb

class ApplicationController < ActionController::Base
  
  # Give views access to these methods:
  helper_method :current_user, :logged_in?
  
  protected
    def current_user
      ...
    end
    
    def logged_in?
      ...
    end
  
end

Disabling ActiveRecord query caching when needed

Posted by Luke Francl
on Monday, August 18

In Rails 2.0 and later, all requests are wrapped in a block that enables query caching.

What this means is that if you execute the exact same query in a single request, the previous results of the query will be returned instead of fetching them from the database again.

Controller actions are wrapped with this automatically, but you can also enable it elsewhere like this:

User.cache do
  # do stuff with caching turned on.
end

However, sometimes you do not want this to happen. For example, if you want to fetch random records from the database, having this cached will cause you to get the same record each time you query.

Fortunately, the cache is easy to disable for parts of your code, with the uncached method:

class User < ActiveRecord::Base
  def self.random
    # query for example purposes only -- 
    # ordering by rand() is slow, see here: 
    # http://jan.kneschke.de/projects/mysql/order-by-rand
    uncached do 
      find(:first, :order => "rand()") 
    end
  end
end

Disabling the cache only affects the code within the block, so unlike clearing the cache (which would also work) the rest of your code will still get the benefit of the query cache.
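To make the behavior concrete, here is a toy model of a query cache in plain Ruby. It is a sketch of the idea only: ToyConnection and everything in it is invented for illustration and bears no relation to how ActiveRecord implements this internally.

```ruby
# Toy connection that counts how often a query really "hits the database".
class ToyConnection
  attr_reader :hits

  def initialize
    @cache   = {}
    @enabled = false
    @hits    = 0
  end

  # Turn caching on for the duration of the block, then discard the cache.
  def cache
    old, @enabled = @enabled, true
    yield
  ensure
    @enabled = old
    @cache.clear
  end

  # Turn caching off for the duration of the block; cached entries survive.
  def uncached
    old, @enabled = @enabled, false
    yield
  ensure
    @enabled = old
  end

  def select(sql)
    return run(sql) unless @enabled
    @cache[sql] ||= run(sql)
  end

  private

  # Pretend database call; each real call returns a different row.
  def run(sql)
    @hits += 1
    "row #{@hits}"
  end
end

conn = ToyConnection.new
sql  = "SELECT * FROM users ORDER BY rand() LIMIT 1"
cached_rows = []
fresh_rows  = []

conn.cache do
  2.times { cached_rows << conn.select(sql) }
  conn.uncached { 2.times { fresh_rows << conn.select(sql) } }
  cached_rows << conn.select(sql) # caching is back on after the block
end

puts cached_rows.inspect
puts fresh_rows.inspect
```
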

Quick tip: store_location with subdomains

Posted by Jon
on Thursday, May 01

Both restful_authentication and the older acts_as_authenticated have a handy method called store_location. This method stores a URL in a session variable for future reference. The obvious use case involves login. For example, if you’re browsing a product anonymously and want to write a review, you’ll need to sign in first. So if you click a link on that product page that requires you to be logged in, and this sends you through the login process, you’ll ideally want to be returned right back to where you were before you logged in. store_location enables this, along with redirect_back_or_default, which is also provided by Rick Olson’s authentication plugins.

You store a location like this:

  def private_action
    unless logged_in?
      store_location
      redirect_to login_path
    end
  end

After authenticating the user, you send them back to the stored location with this:

  def login
    if login_successful? # pseudocode, obviously
      redirect_back_or_default(home_path)
    end
  end

If a location is stored in the session, redirect_back_or_default will send the user to that location. Otherwise, it redirects to the default path.

This is pretty handy. But unfortunately, it doesn’t jump across domains, including subdomains. Tumblon lets parents set up blogs for their families, and these blogs are either identified by a subdomain (e.g. myfamily.tumblon.com) or by a top-level domain (coming soon). Tumblon also has privacy controls, so I can set a story to be viewable only by my family and friends. So if an anonymous user hits the URL of a private photo/story/video, they should be redirected to the login screen and then right back to the item they were trying to view. But out of the box, store_location can’t handle this.

Let’s look at the store_location method to see why. This method is in lib/authenticated_system.rb.

    def store_location
      session[:return_to] = request.request_uri
    end

store_location uses the request.request_uri method, which only provides the relative path (e.g. /photos/932783). So if you log in at tumblon.com, store_location won’t return you to myfamily.tumblon.com/photos/932783; it will send you to tumblon.com/photos/932783. Your app could have logic to redirect from this page to the subdomain, but an easier solution is to create a new method, like store_location_with_domain. Or, if you don’t want a separate method, you could override store_location to use request.url instead of request.request_uri.

    def store_location_with_domain
      session[:return_to] = request.url
    end

Put this method in application.rb, and you can now use redirect_back_or_default to hit an exact URL – complete with subdomain, top-level domain, and port.
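The difference between the two values is easy to see outside Rails with Ruby’s standard URI library. This sketch only illustrates path versus full URL; it doesn’t use the Rails request object, and the login_host variable is an assumption standing in for wherever the login form lives.

```ruby
require 'uri'

uri = URI.parse("http://myfamily.tumblon.com/photos/932783")

path_only = uri.request_uri # path (plus query string, if any), no host
full_url  = uri.to_s        # scheme, host, and path

# Redirecting to the bare path resolves it against the host you are on now.
login_host  = "tumblon.com"
wrong_place = "http://#{login_host}#{path_only}"

puts path_only
puts full_url
puts wrong_place
```
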

5 little-known Rails methods

Posted by Eric Chapweske
on Wednesday, April 23

While the next release of Rails appears to be coming up, there are still plenty of small, useful features from previous releases that aren’t widely used.

A few of my favorites:
  1. query_attribute
  2. polymorphic_path
  3. debug
  4. rake -T `query` (not Rails-specific, but still handy!)
  5. extract_options!

1. ActiveRecord’s query_attribute

Query methods are available for each of a record’s attributes, providing for a cleaner way to check for the presence of an attribute.

# == Schema Information
# Schema version: 17
#
# Table name: users
#
#  id                    :integer(11)     not null, primary key 
#  first_name            :string(255)     
#  last_name             :string(255) 

# Original
class User < ActiveRecord::Base
  def named?
    !first_name.blank? && !last_name.blank?
  end
end

# Refactored to use query_attribute
class User < ActiveRecord::Base
  def named?
     first_name? && last_name?
  end
end

2. Indifferent links with polymorphic paths.

Rails has polymorphic edit/new/formatted path routing available out of the box. Providing an array will namespace the path with those array parameters. Available Methods: (edit|new|formatted|)polymorphic_path(record_or_hash_or_array)

# Before:
<% if @record.is_a?(User) %>
<%= user_path(@record) %>
<% elsif @record.is_a?(Friend) %>
<%= friend_path(@record) %>
 ... etc.

# After:
<%= polymorphic_path(@record) %>

Quite a few options are supported, and other Rails methods take advantage of polymorphic routing:

# Paths can be namespaced:
#=> admin/users/5/edit
edit_polymorphic_path([:admin, @record])

# Polymorphic urls are also internally used by helpers:
# redirects to store_path(@store)
redirect_to @store
  
# builds a form that posts to admin_stores_path
#=> <form action="/admin/stores" ... />
form_for([:admin, Store.new])

# <a href="/stores/5">A store in Minneapolis, MN</a>
link_to @store.name, @store

3. debug

Especially useful when starting out a project, this is a quick way to understand what objects are being used in the view.

<%= debug @user %>

# Yields this in the view:
# - !ruby/object:User 
#  attributes: 
#    salt: 7be4287a1b27426fa6e5b6d733c707dd66425e82
#    updated_at: 2008-04-22 19:01:23
#    crypted_password: abb611def895dac923ba8ea59a78451f77473d5e
#    id: "1"
#    first_name: Eric
#    last_name: Chapweske
#    created_at: 2008-04-06 01:45:33
#  attributes_cache: {}

4. rake -T task

This is a handy Rake feature, and not limited to Rails. Can’t remember the exact syntax for a particular rake task? Trim the results generated with `rake -T` with an optional search parameter.

$ rake -T

rake annotate_models                 # Add schema information (as comments)...
rake audit:purge                     # Removes Audit records older than 2 m...
rake db:abort_if_pending_migrations  # Raises an error if there are pending...
... etc.
rake tmp:sockets:clear               # Clears all files in tmp/sockets

# Searching by the task's name

$ rake -T db:migrate:r

rake db:migrate:redo   # Rollbacks the database one migration and re migrat...
rake db:migrate:reset  # Resets your database using your migrations for the...

5. extract_options!

While not needed very often, Rails comes bundled with a method to extract the options from methods that utilize the splat operator. This method removes the last object from an array if it’s a Hash, otherwise an empty hash is returned.

class Story < ActiveRecord::Base

  # Example: 
  # Story.published_and_tagged_with('deep', 'thoughts', :order => 'created_at desc') 
  # The generated options for this method look like this: 
  #=> { :include => :tags, :order => 'created_at desc' }
  def self.published_and_tagged_with(*tag_names)
    options = tag_names.extract_options!
    options[:include] ||= :tags
    
    ...
  end
end
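If you’re curious what that amounts to, here is a plain-Ruby approximation of the behavior described above. It is a sketch for illustration, not the Rails source, and the method names extract_options_from! and tagged_with are made up.

```ruby
# Pop the trailing Hash off an argument list, or return an empty Hash.
def extract_options_from!(args)
  args.last.is_a?(Hash) ? args.pop : {}
end

# Hypothetical method using the splat operator plus trailing options.
def tagged_with(*tag_names)
  options = extract_options_from!(tag_names)
  options[:include] ||= :tags
  [tag_names, options]
end

names, options = tagged_with('deep', 'thoughts', :order => 'created_at desc')
puts names.inspect
puts options.inspect

bare_names, bare_options = tagged_with('deep')
puts bare_options.inspect
```
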

References

I wasn’t able to find any write-ups on the above methods, so reading the source code may be the best path if you’re curious about their exact implementations.

Cleaner code with Conversions

Posted by Eric Chapweske
on Friday, February 29

Rails comes with built in support for easily customizing the to_s method. It’s something that’s pretty well documented in the source, but rarely used in practice, which is unfortunate, because the common alternative approach is a bit messy:

Date.today.strftime('%B %e, %Y') #=> "February 29, 2008"

A small, but useful improvement:

Date.today.to_s(:long) #=> "February 29, 2008"

The underlying implementation is straightforward, with some classes (Date, Time, DateTime, and Range) allowing you to add custom formats. The advantage is all the formatting logic is kept in one place, and available via a unified interface, rather than defined randomly throughout the application.

Write a custom conversion format

Rolling a custom conversion is as easy as adding an entry to the class’s format hash:

config/initializers/conversions.rb

# Formats the time using strftime.
# Example: Time.now.to_s(:event) #=> "03:23PM" 

Time::Conversions::DATE_FORMATS.update(:event => '%I:%M%p')

In addition to strings, lambdas are also supported. If the value is a callable object instead of a string, the result of that call will be returned:

# Formats the date using a lambda.
# Example: Date.today.to_s(:event) #=> "February 29th"

date_formats = { :event => lambda { |date| date.strftime("%B #{date.day.ordinalize}") } }
Date::Conversions::DATE_FORMATS.update(date_formats)

# Examples: 
# (1.day.ago.to_date..5.days.from_now.to_date).to_s(:event) #=> "February 28th to March 5th"
# (1.hour.ago..3.hours.from_now).to_s(:event) #=> "03:10PM to 07:10PM"
                                          
range_formats = { :event => lambda do |start, stop| 
                              [ start.to_s(:event), stop.to_s(:event) ].join(" to ") 
                            end }
Range::Conversions::RANGE_FORMATS.update(range_formats)
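The string-or-lambda lookup is simple enough to sketch in plain Ruby. This is a rough illustration of the idea, not the ActiveSupport implementation; the DemoFormats module and its ordinal helper are made up for the example.

```ruby
require 'date'

module DemoFormats
  # A format is either a strftime string or something that responds to call.
  FORMATS = {
    :long  => '%B %e, %Y',
    :event => lambda { |date| date.strftime("%B #{DemoFormats.ordinal(date.day)}") }
  }

  # Small stand-in for ordinalize: 1 => "1st", 22 => "22nd", 11 => "11th".
  def self.ordinal(n)
    return "#{n}th" if (11..13).include?(n % 100)
    suffix = { 1 => 'st', 2 => 'nd', 3 => 'rd' }.fetch(n % 10, 'th')
    "#{n}#{suffix}"
  end

  # Call the formatter if it is callable, otherwise hand it to strftime.
  def self.format(date, name)
    formatter = FORMATS.fetch(name)
    formatter.respond_to?(:call) ? formatter.call(date) : date.strftime(formatter)
  end
end

leap_day   = Date.new(2008, 2, 29)
long_text  = DemoFormats.format(leap_day, :long)
event_text = DemoFormats.format(leap_day, :event)

puts long_text
puts event_text
```
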

A common situation is where both the Date and Time objects should share the same formats. Keeping the format definitions in the same initializer makes this easy:

  date_formats = { :event => lambda { |date| date.strftime("%B #{date.day.ordinalize}") },
                   :story => ... }

  
  Date::Conversions::DATE_FORMATS.update(date_formats)
  Time::Conversions::DATE_FORMATS.update(date_formats)
  # DateTime uses Time's DATE_FORMATS, so there's nothing to update for it.

A note on naming

Coming up with an easy to remember, expressive name for formats is a bit challenging. The default Rails formats take the approach of trying to describe the result in their names (:short, :long, :long_ordinal). In larger projects, it’s difficult to remember what each format does and where it should be used.

A naming system that’s working a bit better is to name formats after their intended use case (:event, :blog, :hours, :event_hours, etc.).

Fuzzing your database for fun and profit

Posted by Luke Francl
on Friday, January 25

Fuzz testing is throwing random data at your application and seeing what breaks. We don’t usually do that. But we often do need lots of semi-realistic data added to our development database.

This helps you:

  • see how things will look when there’s more in the site.
  • nail down the indexes you’ll need (Queries that run fine with 10 rows of fixture data fall down on 10,000 rows of random data).

It’s possible to do this with fixtures and ERB but I find it tedious. Plus by using Active Record directly you can guarantee that the objects you’re inserting are valid.

First, create a new rake task in lib/tasks/fuzz.rake:

namespace :db do
  desc 'Insert some random posts'
  task :fuzz => :environment do
    if RAILS_ENV.downcase == "production"
      raise "You can't fuzz your production environment. Think of the children!"
    end
    
    Fuzz.execute(ENV['SIZE'].to_i)
    
  end
end

You’ll call this with rake db:fuzz SIZE=1000. You can actually put all the code in the rakefile, but it’s a little easier to manage to split it out into a separate class.
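One detail worth knowing about the SIZE argument: if you leave it off, ENV['SIZE'] is nil, and nil.to_i is 0, which is why a size of 0 can be treated as “use the default”. A small plain-Ruby sketch of that handling follows; fuzz_size is a made-up helper name, not part of the task above.

```ruby
# Turn the raw SIZE value (a String, or nil when unset) into a usable count.
def fuzz_size(raw, default = 100)
  size = raw.to_i # nil.to_i and "abc".to_i are both 0
  size > 0 ? size : default
end

puts fuzz_size(nil)
puts fuzz_size("1000")
puts fuzz_size("abc")
```
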

In lib/fuzz.rb, write something like this example, which finds a random user and adds a post from them to the system SIZE times. The fuzz script could do anything you want, though.

class Fuzz
  ActiveRecord::Base.establish_connection(RAILS_ENV.to_sym)

  # This file location varies by OS. This is the Mac OS X location.
  # At 2.4M, you have plenty of RAM to read it all into memory!
  @@words = File.readlines("/usr/share/dict/words").collect do |line|
    line.strip
  end
  
  def self.execute(size)
    if size == 0 or size.nil?
      size = 100
    end

    ActiveRecord::Base.silence {
      User.transaction do
        size.times do
          user = User.find(:first, :order => "rand()")
          user.posts.create!(:body => random_words(rand(30)))
        end
        puts "Created #{size} posts"
      end
    }
  end
  
  # provide a string with num random words in it.
  def self.random_words(num = 1)
    w = []
    num.times do
      w << @@words[rand(@@words.size)]
    end
    w.join(" ")
  end
  
end

Silencing the logger and wrapping everything in a transaction make the code run faster, which matters if you’re creating 10,000 records. Another thing you can do to speed things up is disable timestamps, but I’ve found that causes more trouble than it’s worth, because you often want to use those timestamps in your app!

Extra credit: While the data generated from random dictionary words is often hilarious, it’s not very realistic. Use Faker to create more realistic fake data and sometimes to randomize those non-required fields.

Remote MySQL GUI with SSH

Posted by Luke Francl
on Tuesday, August 28

Back in my PHP/MySQL days I used to be quite the MySQL console jockey. I used it for all kinds of stuff. Then I got a new job, moved to DB2 and thankfully forgot as much as I could about MySQL. Now I’m doing Rails and working with MySQL again. But these days I use CocoaMySQL for nosing around the database on my local machines.

On remote servers, I was still using the console, but I recently found this trick which allows you to open up CocoaMySQL on a remote database using an SSH tunnel. The database doesn’t have to be configured to accept connections from outside of localhost.

Here’s how it works.

First, create an SSH tunnel.

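ssh -L 8888:127.0.0.1:3306 user@example.com

Here I’m connecting the free port 8888 on my local machine to 3306 (the MySQL port) on the remote server, logging in as user. The 127.0.0.1 in the middle is resolved on the remote end, so MySQL sees the connection as coming from localhost.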

Then configure CocoaMySQL to use the tunnel. Set the host to 127.0.0.1 and the port to 8888. The user, database, and password will be that of your remote server.

(There’s a section in the config screen to use an SSH tunnel, which I think is supposed to create the tunnel automatically, but I wasn’t able to get that to work.)

I’ve found this tip useful in my work. Hopefully you will too!