ActiveRecord referential integrity is broken. Let's fix it!

Posted by Jon
on Tuesday, August 18

ActiveRecord supports cascading deletes to preserve referential integrity:

class User
  has_many :posts, :dependent => :destroy
end

But you really only want cascading deletes about half the time. The other half, you want to actually restrict deletion of a record with dependencies. ActiveRecord doesn’t support this.

Think of an e-commerce system where a user has many orders. Once an order has gone through, you shouldn’t be able to delete the user who placed the order. You need a record of the order and the user who placed it.

Or even more obvious, think of a lookup table. An Order might have several of these dependencies: OrderStatus, Currency, DiscountLevel, etc. In all of these cases, you want ON DELETE restrict, not ON DELETE cascade. But Rails doesn’t support this. That’s dumb.

If you agree, head on over to the Rails UserVoice site and make your opinion known! There is a ticket for this already. Vote it up if you think Rails should implement this.

The solution to the problem is really pretty simple. ActiveRecord just needs something like this:

class User
  has_many :posts, :dependent => :restrict
end

In this case, if you try to destroy a user that has one or more posts, Rails should complain. You’ve told the app: “Don’t let me delete users who have posts!” The easiest way to do this is to have Rails throw an exception, and have your controller capture the exception and print a flash message. Other approaches could work too.
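
As a rough illustration of the behaviour being asked for, here is a plain-Ruby sketch with no ActiveRecord involved. The User class, the DependentRecordsError exception, and the destroy_user helper are all hypothetical stand-ins; this is not the code from the gist or any Rails API:

```ruby
# Hypothetical exception class, invented for this sketch.
class DependentRecordsError < StandardError; end

# Stand-in for a model declared with has_many :posts, :dependent => :restrict.
class User
  attr_reader :name, :posts

  def initialize(name, posts = [])
    @name = name
    @posts = posts
    @destroyed = false
  end

  def destroyed?
    @destroyed
  end

  # Refuse to destroy the record while dependent posts exist.
  def destroy
    unless posts.empty?
      raise DependentRecordsError, "Cannot delete #{name}: #{posts.size} post(s) still exist"
    end
    @destroyed = true
  end
end

# Roughly what a controller action might do: rescue the error and
# turn it into a message for the flash.
def destroy_user(user)
  user.destroy
  "User deleted"
rescue DependentRecordsError => e
  e.message
end

puts destroy_user(User.new("alice"))
puts destroy_user(User.new("bob", ["First post"]))
```

The point of raising rather than returning false is that the caller cannot ignore the failure by accident.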

So why is this important?

1. It’s common. Every project should maintain referential integrity in some way, and :dependent => :destroy isn’t always appropriate. Who wants to do a cascading delete from roles to users, or manufacturers to products, or order_statuses to orders? I don’t think I’ve ever worked on a project where cascading deletes were always appropriate. Any lookup table, at minimum, needs this feature. (I personally prefer to maintain referential integrity with foreign keys, but even still, I’d love to have an application-level check first, which would be easier to rescue. And some projects don’t use foreign keys.)

2. It fits with the Rails philosophy. Rails says “Let your application handle referential integrity, not the database”. But without :dependent => :restrict, one of the most important pieces of referential integrity is missing.

3. It’s easy. 9 lines of code to add this to has_many. Check out this gist: http://gist.github.com/170059.

Someone wrote a plugin for this, but it has the distinct disadvantage of not working anymore. This should really be a core feature anyway, at least as long as :dependent => :destroy is a core feature.

The UserVoice suggestion for this is at http://rails.uservoice.com/pages/10012-rails/suggestions/103508-support-dependent-restrict-and-dependent-nullify.

Today's hard-won lesson

Posted by Luke Francl
on Thursday, March 05

A float subtracted from an integer results in a float. When typecast by ActiveRecord, this is converted to an integer.

validates_numericality_of with :only_integer => true results in a rather obscure error message if a non-integer is present (“is not a number”).

validates_numericality_of uses the attribute value before type cast for its validation.

...

That means if you calculate a value before validation, what is printed out by using the attribute method is different than what validates_numericality_of is using for validation. You need to ensure that the value of attr_name_before_type_cast is an integer!
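
A plain-Ruby sketch of the mismatch, with no Rails involved. The INTEGER_ONLY pattern is an assumption about roughly what an integer-only check on the raw value looks like, not a copy of the validator’s source:

```ruby
# Assumed stand-in for the integer-only check applied to the raw value.
INTEGER_ONLY = /\A[+-]?\d+\z/

def passes_integer_check?(raw_value)
  !!(raw_value.to_s =~ INTEGER_ONLY)
end

raw_value  = 10 - 2.0        # Integer minus Float gives a Float
cast_value = raw_value.to_i  # what an integer column would hand back after type cast

puts raw_value.class                          # Float
puts cast_value                               # 8
puts passes_integer_check?(raw_value)         # false, because "8.0" is not all digits
puts passes_integer_check?(raw_value.round)   # true, rounding first yields an Integer
```

Converting the calculated value with round or to_i before assigning it keeps the raw value and the cast value in agreement.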

I just spent way too much time figuring this out.

Is your Rails application safe?

Posted by Eric Chapweske
on Monday, September 22

Rails provides many great security features. Its design can also create significant security holes. In the case of ActiveRecord’s mass assignment vulnerability, the security issues are more severe and widespread than many of us recognize.

Nearly every open source Rails application I’ve seen is vulnerable, and most closed source ones as well. There are some great solutions for protecting your application from attack, but first, the problem:

The Problem

By default ActiveRecord allows visitors access to any writer method, that is, any method ending with an equal sign. This comes courtesy of the ActiveRecord::Base#attributes= method, which is used internally by the main methods that handle creating and updating records, including new(), create(), and update_attributes().

The way most applications are designed means that whatever data a visitor sends to the server will likely find its way through the attributes=() method, and if not protected, ActiveRecord will happily update the records based on what was sent. In less technical terms: ActiveRecord is insecure by default.
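
To make the mechanism concrete, here is a simplified plain-Ruby sketch of what mass assignment amounts to. ToyOrder and its attributes= method are illustrative stand-ins written for this post, not ActiveRecord’s actual implementation:

```ruby
# Illustrative stand-in for a model; this is not ActiveRecord.
class ToyOrder
  attr_accessor :price_in_cents, :user_id, :state

  def initialize(attrs = {})
    self.attributes = attrs
  end

  # Every key in the hash becomes a call to the matching writer method.
  def attributes=(attrs)
    attrs.each { |name, value| send("#{name}=", value) }
  end
end

order = ToyOrder.new("price_in_cents" => 4999, "state" => "pending")

# Parameters as they might arrive from a hostile request.
order.attributes = { "price_in_cents" => 0, "state" => "paid" }

puts order.price_in_cents  # 0
puts order.state           # paid
```

Nothing in this loop distinguishes a field the form was meant to expose from one it was not.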

As an example, let’s look at a request against vulnerable code:

# The request
$ curl -X PUT -d "order[price_in_cents]=0" example.com/orders/225

app/models/order.rb

class Order < ActiveRecord::Base
  # Table name: orders
  #  id          :integer(11)     not null, primary key 
  #  price_in_cents     :integer(11)
  #  user_id     :integer(11)     
  #  state       :string(255)               

  has_many :line_items
  
  acts_as_state_machine :initial => :pending

  state :pending
  state :paid
    
  def name
    ... 
  end
  
  def shipped_on=(shipping_date)
    ...
  end

end

app/controllers/orders_controller.rb

class OrdersController < ApplicationController
  ...

  def update
    @order.update_attributes(params[:order])
  end

end

Pop quiz: which Order instance methods are exposed to the world?
  • Attributes generated from its table: price_in_cents=, user_id=, state=
  • Attributes generated by association macros: line_item_ids=
  • Other defined writer methods: shipped_on=

Ruby’s dynamic nature and ActiveRecord’s changing API make this exercise more of a guess than anything else. Does Rails 2.1 dynamically generate different writer methods? Will Rails 2.2? How about the plugins and libraries the application relies on?

Theoretically, this isn’t a problem since ActiveRecord provides a solution out of the box: “Sensitive attributes can be protected from this form of mass-assignment by using the attr_protected macro. Or you can alternatively specify which attributes can be accessed with the attr_accessible macro”
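
The allow-list idea behind attr_accessible can be sketched in plain Ruby. The GuardedOrder class and its accessible class method are a hypothetical imitation written for illustration; this is not how ActiveRecord implements the macro:

```ruby
class GuardedOrder
  attr_accessor :price_in_cents, :state, :shipping_note

  # Hypothetical imitation of attr_accessible: remember which writers are allowed.
  def self.accessible(*names)
    @accessible = names.map(&:to_s)
  end

  def self.accessible_names
    @accessible || []
  end

  accessible :shipping_note

  # Silently skip any key that is not on the allow list.
  def attributes=(attrs)
    attrs.each do |name, value|
      next unless self.class.accessible_names.include?(name.to_s)
      send("#{name}=", value)
    end
  end
end

order = GuardedOrder.new
order.price_in_cents = 4999   # direct assignment still works
order.attributes = { "price_in_cents" => 0, "shipping_note" => "Leave at door" }

puts order.price_in_cents  # 4999
puts order.shipping_note   # Leave at door
```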

The Reality

Naturally, professional developers experienced with the framework use attr_accessible/attr_protected and don’t suffer from these problems. As a quick poll, here are a few of the more popular open source code bases:

  1. Insoshi is second only to Rails as a top forked project on github. It’s a social networking app developed by a seed funded startup whose team includes the author of a very well-reviewed book on developing social applications in Rails.
  2. Mephisto, the Rails-based blogging application, which Railspikes runs on.
  3. Anonymous App is a large Rails project with seasoned developers. I’m withholding its details since it has security issues that are still being addressed.
  4. Rubyflow, the codebase of Peter Cooper’s very useful Ruby news aggregation site.
  5. Spree is a rapidly-maturing ecommerce project and powers RailsEnvy’s new screencast store.

Good projects. Professional developers. Every project except Mephisto is vulnerable. Any forum thread in Insoshi will raise exceptions and be unusable after the user_id is changed to a non-existent user. A similar approach worked on Anonymous and Rubyflow. Since these projects lacked any strategy for handling this kind of problem, it’s highly probable that much more damaging attacks exist. One example: Spree’s public exposure of the 'state' attribute allowed me to make my order appear as though it was paid for when I hadn’t even entered my payment information. While these projects vary in terms of risk, in each case the cost of solving this issue is cheap when compared to the cost of cleaning up after an attack.

I’m singling out these applications because they’re popular and open source, but every project I’ve developed has experienced the same security issues. The only thing that seems to change is how much data is vulnerable and how important it is. It’s a difficult problem to manage. Retrofitting security on existing code is a very unpleasant experience. It’s easy to forget when developing new applications. Educating other developers on the problem has proved unreliable.

As an aside, I’m impressed by the response of the developers on these projects. Insoshi, Rubyflow, and Spree addressed the issue almost instantly after being informed. It was a reminder to me of how lucky I am to be involved in such a passionate, professional community. Michael Hartl of Insoshi went so far as to write a mass assignment auditing plugin and offers some great advice on how he ended up tackling the problem.

A solution

  1. Don’t use attr_protected. I haven’t seen a compelling use case for it. Its functionality is confusing. It should probably be removed from ActiveRecord.
  2. Do use attr_accessible. Its white list approach forces an explicit decision on the mass assignability of attributes. A rule of thumb: if an attribute shouldn’t be in a user-submittable form, it shouldn’t be accessible.
  3. Review and audit. Even with attr_accessible, a developer can still shoot themselves in the foot without code audits and reviews. Even if the application is secure today, holes will eventually be introduced into the code. In addition to peer review, automated auditing tools are a great, inexpensive way to find such security problems.
  4. Make it automatic. Disable mass assignment by default, requiring attr_accessible to be specified for each attribute. I’ve taken this approach on maybe 5 projects now. Here’s how to do it:

config/initializers/disable_mass_assignment.rb

ActiveRecord::Base.send(:attr_accessible, nil)

It’s worked quite well, with the exception of two cases where I had to retrofit it on larger applications. That was a nightmare. I’ve been tinkering with a plugin that aims to reduce some of the problems caused by attr_accessible, and make retrofitting a more pleasant experience. It’s not production ready, but I think there are some small improvements in it worth stealing.

The downside: it’s pretty much a guarantee that you’ll run into confusing bugs during development. This is a major problem for developers new to the framework, and is annoying for the more experienced. ActiveRecord used to raise exceptions in development when mass assignment was attempted with an inaccessible attribute. This was great, but there were a few complaints, and conflicts with ActiveResource, so the change was pulled.

A better solution?

An alternative approach worth exploring is the route taken by Merb, which decided this is the controller’s problem, and has a plugin providing params_accessible functionality. There’s a similar plugin for Rails. This approach may be especially appreciated by developers who want to add some level of protection to an existing application, since less code needs to change.
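
A rough plain-Ruby sketch of the controller-side idea: filter the incoming hash before it ever reaches the model. The permitted_params helper and the key names are hypothetical and do not reflect the API of either plugin:

```ruby
# Hypothetical helper: keep only the keys a form is expected to submit.
def permitted_params(params, allowed_keys)
  allowed = allowed_keys.map(&:to_s)
  params.select { |key, _value| allowed.include?(key.to_s) }
end

incoming = {
  "shipping_note"  => "Leave at door",
  "price_in_cents" => "0",
  "state"          => "paid"
}

safe = permitted_params(incoming, [:shipping_note])
p safe.keys  # only "shipping_note" survives
```

Because the filtering happens in one place per action, existing models can stay untouched, which is what makes this attractive for retrofits.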

I’ve hesitated to use this on applications that use ActiveRecord, which has a bad habit of making methods part of the public API when they should be privately scoped (those ending in _id, _ids, _count, most enumerables, etc.). Because of this, attr_accessible serves double duty by discouraging public use of writer methods that should be private. Not really the best excuse, and I’d like to give the params_protected approach a try on my next Rails project.

Regardless of the solution, the cost of designing applications to handle potential mass assignment abuse from the beginning is so much cheaper than attempting to retroactively address the issue. Rails should step up and encourage such design decisions. Whether it’s something as extreme as disabling mass assignment from the start, or an unobtrusive change like adding a commented out attr_accessible line in generated models, the risk shouldn’t be ignored.

Security Tools

There are a few other related tools that look promising for developing more secure code:
  • Tarantula: A fuzzing plugin that spiders your application looking for problems. Via Stuart Halloway’s post on Relevance’s blog: “It crawls your rails app, fuzzing inputs and analyzing what comes back. We have pointed Tarantula at about 20 Rails applications, both commercial and open source, and have never failed to uncover flaws.” Aaron Bedra’s Rails Security Audit PDF on Peepcode devotes significant space to getting this up and running. It also covers a few of the common mistakes developers can make when using a framework like Rails, and that alone may make it a worthwhile read.
  • ratproxy: Happened upon this on Google’s excellent security blog. From their announcement post: “[ratproxy] is designed to transparently analyze legitimate, browser-driven interactions with a tested web property and automatically pinpoint, annotate, and prioritize potential flaws or areas of concern.”
  • Audit Mass Assignment: Scans ActiveRecord models looking for potential mass assignment mistakes.
  • Find Mass Assignment: Searches controller actions for likely mass assignment, and then finds the corresponding models that don’t have attr_accessible defined.

Disabling ActiveRecord query caching when needed

Posted by Luke Francl
on Monday, August 18

In Rails 2.0 and later, all requests are wrapped in a block that enables query caching.

What this means is that if you execute the exact same query in a single request, the previous results of the query will be returned instead of fetching them from the database again.

Controller actions are wrapped with this automatically, but you can also enable it elsewhere like this:

User.cache do
  # do stuff with caching turned on.
end

However, sometimes you do not want this to happen. For example, if you want to fetch random records from the database, having this cached will cause you to get the same record each time you query.

Fortunately, the cache is easy to disable for parts of your code, with the uncached method:

class User < ActiveRecord::Base
  def self.random
    # query for example purposes only -- 
    # ordering by rand() is slow, see here: 
    # http://jan.kneschke.de/projects/mysql/order-by-rand
    uncached do 
      find(:first, :order => "rand()") 
    end
  end
end

Disabling the cache only affects the code within the block, so unlike clearing the cache (which would also work) the rest of your code will still get the benefit of the query cache.
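
The behaviour is easier to see in a toy model. The ToyConnection class below is a plain-Ruby sketch written for this post; it imitates the idea of a per-block query cache keyed by the SQL string and is not ActiveRecord’s implementation:

```ruby
class ToyConnection
  attr_reader :hits

  def initialize
    @cache = nil   # nil means caching is switched off
    @hits = 0      # how many times the "database" was really queried
  end

  # Switch caching on for the duration of the block.
  def cache
    previous, @cache = @cache, {}
    yield
  ensure
    @cache = previous
  end

  # Switch caching off for the duration of the block, then restore it.
  def uncached
    previous, @cache = @cache, nil
    yield
  ensure
    @cache = previous
  end

  def query(sql)
    return run(sql) if @cache.nil?
    @cache[sql] ||= run(sql)
  end

  private

  def run(sql)
    @hits += 1
    "result of database hit #{@hits}"
  end
end

conn = ToyConnection.new
sql = "SELECT * FROM users ORDER BY rand() LIMIT 1"

conn.cache do
  conn.query(sql)                   # hits the database
  conn.query(sql)                   # identical SQL, served from the cache
  conn.uncached { conn.query(sql) } # cache bypassed, hits the database again
end

puts conn.hits  # 2
```

Note that the cached entry survives the uncached block, which mirrors the point above: only the code inside the block loses the cache.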

5 little-known Rails methods

Posted by Eric Chapweske
on Wednesday, April 23

While the next release of Rails appears to be coming up, there’s still plenty of small, useful features from previous releases that aren’t widely used.

A few of my favorites:
  1. query_attribute
  2. polymorphic_path
  3. debug
  4. rake -T `query` ( Not Rails specific, but still handy! )
  5. extract_options!

1. ActiveRecord’s query_attribute

Query methods are available for each of a record’s attributes, providing for a cleaner way to check for the presence of an attribute.

# == Schema Information
# Schema version: 17
#
# Table name: users
#
#  id                    :integer(11)     not null, primary key 
#  first_name            :string(255)     
#  last_name             :string(255) 

# Original
class User < ActiveRecord::Base
  def named?
    !first_name.blank? && !last_name.blank?
  end
end

# Refactored to use query_attribute
class User < ActiveRecord::Base
  def named?
     first_name? && last_name?
  end
end

2. Indifferent links with polymorphic paths.

Rails has polymorphic edit/new/formatted path routing available out of the box. Providing an array will namespace the path with those array parameters. Available methods: (edit_|new_|formatted_|)polymorphic_path(record_or_hash_or_array)

# Before:
<% if @record.is_a?(User) %>
<%= user_path(@record) %>
<% elsif @record.is_a?(Friend) %>
<%= friend_path(@record) %>
 ... etc.

# After:
<%= polymorphic_path(@record) %>

Quite a few options are supported, and other Rails methods take advantage of polymorphic routing:

# Paths can be namespaced:
#=> admin/users/5/edit
edit_polymorphic_path([:admin, @record])

# Polymorphic urls are also internally used by helpers:
# redirects to store_path(@store)
redirect_to @store
  
# builds a form for a new record, which posts to admin_stores_path
#=> <form action="/admin/stores" ... />
form_for([:admin, Store.new])

# <a href="/stores/5">A store in Minneapolis, MN</a>
link_to @store.name, @store
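
For intuition only, here is a toy plain-Ruby model of how a route helper name could be derived from a record or an array. The toy_polymorphic_helper_name function is hypothetical; real Rails routing also handles new records, pluralization, and the actual URL generation, all of which this sketch ignores:

```ruby
# Toy model: symbols act as namespaces, records contribute their class name.
def toy_polymorphic_helper_name(record_or_array, action = nil)
  parts = Array(record_or_array).map do |part|
    part.is_a?(Symbol) ? part.to_s : part.class.name.downcase
  end
  ([action] + parts).compact.join("_") + "_path"
end

# Empty stand-in classes so the sketch runs without Rails.
class User; end
class Store; end

puts toy_polymorphic_helper_name(User.new)                    # user_path
puts toy_polymorphic_helper_name([:admin, Store.new], :edit)  # edit_admin_store_path
```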

3. debug

Especially useful when starting out a project, this is a quick way to understand what objects are being used in the view.

<%= debug @user %>

# Yields this in the view:
# - !ruby/object:User 
#  attributes: 
#    salt: 7be4287a1b27426fa6e5b6d733c707dd66425e82
#    updated_at: 2008-04-22 19:01:23
#    crypted_password: abb611def895dac923ba8ea59a78451f77473d5e
#    id: "1"
#    first_name: Eric
#    last_name: Chapweske
#    created_at: 2008-04-06 01:45:33
#  attributes_cache: {}

4. rake -T task

This is a handy Rake feature, and not limited to Rails. Can’t remember the exact syntax for a particular rake task? Trim the results generated with `rake -T` with an optional search parameter.

$ rake -T

rake annotate_models                 # Add schema information (as comments)...
rake audit:purge                     # Removes Audit records older than 2 m...
rake db:abort_if_pending_migrations  # Raises an error if there are pending...
... etc.
rake tmp:sockets:clear               # Clears all files in tmp/sockets

# Searching by the task's name

$ rake -T db:migrate:r

rake db:migrate:redo   # Rollbacks the database one migration and re migrat...
rake db:migrate:reset  # Resets your database using your migrations for the...

5. extract_options!

While not needed very often, Rails comes bundled with a method to extract the options from methods that utilize the splat operator. This method removes the last object from an array if it’s a Hash, otherwise an empty hash is returned.

class Story < ActiveRecord::Base

  # Example: 
  # Story.published_and_tagged_with('deep', 'thoughts', :order => 'created_at desc') 
  # The generated options for this method look like this: 
  #=> { :include => :tags, :order => 'created_at desc' }
  def self.published_and_tagged_with(*tag_names)
    options = tag_names.extract_options!
    options[:include] ||= :tags
    
    ...
  end
end
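
The behaviour described above is small enough to sketch in plain Ruby. The extract_options_from! helper is a hypothetical stand-in written for this post (it takes the array as an argument rather than extending Array), not the Rails source:

```ruby
# Pop the last element if it is a Hash, otherwise return an empty Hash.
def extract_options_from!(args)
  args.last.is_a?(Hash) ? args.pop : {}
end

def published_and_tagged_with(*tag_names)
  options = extract_options_from!(tag_names)
  options[:include] ||= :tags
  [tag_names, options]
end

names, options = published_and_tagged_with('deep', 'thoughts', :order => 'created_at desc')
p names
p options
```

After the call, names holds only the two tag strings and options holds both the caller’s :order and the defaulted :include.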

References

I wasn’t able to find any write ups on the above methods, so reading the source code may be the best path if you’re curious about their exact implementations.

Rose^h^h^huby Colored Glasses

Posted by Dan Grigsby
on Saturday, May 19

Earlier this year, I attended Avi Bryant’s Applied Web Heresies tutorial session at ETech. Starting from scratch, over about four hours, the other students and I each created a mini-Seaside framework in the language of our choice.

The details of the tutorial are worthy of a post, and Nick Sieger and I are working on one. For this post, I’ll just use his thesis from the talk as a starting point. Avi says,

“A lot of the design decisions that are unconsciously adopted by modern frameworks like Rails or Django were made 10 or 15 years ago. They were probably good decisions at the time, but we need to re-evaluate them now that we have new and better tools available to us.”

The “unconsciously adopted” part of his thesis has been bouncing around in my head since then, and showed up in an unexpected place that I want to explore – think out, really – in this post.

To get where I want to go with this post I’ll need to go a bit off-topic for a couple of paragraphs, but I’ll bring it back quickly, so bear with me.

About a month ago, a group of us – including fellow Rail Spikes contributors Luke and Jon – decided to teach ourselves functional programming. With the help of the Pragmatic Programmers’ Beta Book, I decided to tackle Erlang.

Erlang’s primary data structure unit is a “tuple.” Adapting an example from the book, here’s a tuple that represents a single contact in an address book:

{person,
  {first_name, "Dan"},
  {last_name, "Grigsby"}
}.

Even if you’ve never seen Erlang before, it should be pretty clear what’s going on. And, of course, an address book would normally have more fields, but this simplified example will be good enough.

Like Ruby, Erlang uses square brackets to signify lists. A list of people in an address book would look like this:

AddressBook = [
  {person, {first_name, "Dan"}, {last_name, "Grigsby"}},
  {person, {first_name, "Kristy"}, {last_name, "Grigsby"}},
  {person, {first_name, "Dan"}, {last_name, "Buettner"}}
].

Now that we’ve covered this little bit of Erlang, it’s time to bring this post back onto topic. Recalling Avi’s unconscious adoption thesis, I want to talk about databases and ORMs.

As Rails developers, we spend a lot of time working to “round trip” Ruby objects into and out of the database. While ActiveRecord makes this translation considerably simpler than other ORMs, ultimately we still put in work to get what we want:

We want Ruby objects.

As an experiment, let’s think about working exclusively with lists of pure Ruby objects within the context of the CRUD (Create, Read, Update, Destroy) verbs, since that reflects pretty well what we do with ActiveRecord.

To keep this discussion anchored, let’s port our Erlang Person tuple/record to Ruby and work with that:

class Person
  attr_accessor :first_name, :last_name
  
  def initialize(first_name = nil, last_name = nil)
    @first_name, @last_name = first_name, last_name
  end
end

We want an Address book, which we’ll represent as a list of Person objects, so let’s add that:

address_book = []
address_book << Person.new('Dan', 'Grigsby')
address_book << Person.new('Kristy', 'Grigsby')
address_book << Person.new('Dan', 'Buettner')

Before we go on, we should take a second to think about storage:

Obviously, Ruby objects can be stored in memory; keep them in a long running process and you’d be able to work with them through a series of http requests. Serialize them and you can store them on disk for when your long-running process dies or needs to be shut down.
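
A minimal sketch of the serialize-to-disk idea using Ruby’s built-in Marshal. The save_address_book and load_address_book helpers are hypothetical names chosen for this example, and a temp file stands in for wherever the data would really live:

```ruby
require "tempfile"

class Person
  attr_accessor :first_name, :last_name

  def initialize(first_name = nil, last_name = nil)
    @first_name, @last_name = first_name, last_name
  end
end

# Write the whole list to disk in Ruby's binary serialization format.
def save_address_book(path, address_book)
  File.binwrite(path, Marshal.dump(address_book))
end

# Read it back, for example after the long-running process restarts.
def load_address_book(path)
  Marshal.load(File.binread(path))
end

Tempfile.create("address_book") do |file|
  save_address_book(file.path, [Person.new("Dan", "Grigsby"), Person.new("Kristy", "Grigsby")])
  restored = load_address_book(file.path)
  puts restored.map(&:first_name).join(", ")  # Dan, Kristy
end
```

Marshal should only be used to load data the application wrote itself, since loading untrusted input is unsafe.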

Alright, back to CRUD verbs:

Let’s knock down the basics quickly:

Create is demonstrated above: use the Person.new method to create an object and then add it to the address_book list. To Read an object, reference it by its index: address_book[0]. Update follows suit: address_book[0].first_name = 'Bob'. To Destroy, use any of the delete/slice/shift/pop array methods.
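
Spelled out as code, the four verbs look like this. A Struct is used here only as shorthand for the Person class defined earlier:

```ruby
# Shorthand stand-in for the Person class above.
Person = Struct.new(:first_name, :last_name)

address_book = []

# Create
address_book << Person.new('Dan', 'Grigsby')
address_book << Person.new('Kristy', 'Grigsby')

# Read (by position, the equivalent of finding by ID)
first = address_book[0]

# Update
address_book[0].first_name = 'Bob'

# Destroy (by position)
removed = address_book.delete_at(1)

puts first.first_name     # Bob
puts removed.first_name   # Kristy
puts address_book.size    # 1
```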

You probably noticed that I cheated a bit. My Read example is the equivalent of an ActiveRecord find using an ID. If you happen to know the list offset then you can retrieve your element. The question becomes: how can we selectively choose elements, as you could with the WHERE clause of a SQL statement or an ActiveRecord :conditions argument?

To answer this question, I’ll select out a list of people whose first name is Dan. In SQL, this’d be SELECT * FROM PEOPLE WHERE FIRST_NAME = 'Dan'. In ActiveRecord, :conditions => ["first_name = 'Dan'"] would do it. With an array of Person objects, this Ruby would do it:

dans = address_book.inject([]) do |results, person|
  results << person if person.first_name == 'Dan'
  results
end

(Update: See Chris Carter’s comment below about using Enumerable’s .select for a better way to do this.)

Similarly, Ruby’s Array method “delete_if” could be used to selectively delete from the list.
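
Both of those, the select version and a delete_if call, can be sketched as follows. A Struct is used as shorthand for the Person class defined earlier:

```ruby
# Shorthand stand-in for the Person class above.
Person = Struct.new(:first_name, :last_name)

address_book = [
  Person.new('Dan', 'Grigsby'),
  Person.new('Kristy', 'Grigsby'),
  Person.new('Dan', 'Buettner')
]

# Equivalent of WHERE first_name = 'Dan'; returns a new array.
dans = address_book.select { |person| person.first_name == 'Dan' }

# Equivalent of DELETE ... WHERE last_name = 'Buettner'; modifies the list in place.
address_book.delete_if { |person| person.last_name == 'Buettner' }

puts dans.size          # 2
puts address_book.size  # 2
```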

Using nothing other than pure Ruby objects and standard, simple Array methods we’ve demonstrated most of the common uses of ActiveRecord. We never had to leave Ruby to do this.

There’s a simplicity and an odd-elegance to this that appeals to me.

If that sounds crazy to you, consider that Mnesia, Erlang’s wickedly fast, fault-tolerant DBMS, operates in the mode I’ve described here: it stores lists of native Erlang records (in memory, to disk, or both) and you apply list functions to select items you want. Here’s the select statement demonstrated above in SQL, ActiveRecord and Ruby, but this time in (I’m sure very bad) Erlang:

[ {person, {first_name, "Dan"}, {last_name, X}} || {person, {first_name, "Dan"}, {last_name, X}} <- AddressBook ].

Erlang has Mnesia. Someone who wears “rose colored glasses” could be said to have selective amnesia. Maybe the world needs an adaptation of Mnesia to Ruby: Ruby Colored Glasses.

I’m certain that there are more or less complete Ruby variants of Mnesia. The purpose of this article isn’t to suggest that there should be another one – if you’ve used one, use the comments to tell me about it. Rather, this post examines what we can learn by throwing out one of the things we’ve all “unconsciously adopted” with Rails. Stay tuned for my forthcoming – and aforementioned – article on the mini-Seaside I built in Avi’s tutorial where we’ll question other assumptions to interesting ends.

Testing ActiveRecord Transactions

Posted by Luke Francl
on Wednesday, March 28

ActiveRecord allows you to start transactions that will be rolled back in the event of an error.

A good example is importing records from a CSV file. If you want the entire import to roll back if any of the rows fail to import, you could write your code like this:

def import_csv
  csv_file = params[:csv_file]
  
  begin
    Record.transaction do 
      fastercsv = FasterCSV.new( csv_file )
      while row = fastercsv.readline
        foo, bar = row
        Record.create!( :foo => foo, :bar => bar )
      end
    end
    redirect_to success_action_path
  rescue 
    # do something with the error
    flash[:error] = "CSV import failed"
    redirect_to import_path
  end
end

The Record.create! call will raise an ActiveRecord::RecordInvalid error if one of the rows can’t be saved. Then the rescue block catches the error and reports it to the user instead of showing them an ugly 500 error (or, worse, a corrupted import).

However, this doesn’t play nicely with your tests.

You’d like to do something like this:

def test_import_csv_failure
  assert_no_difference Record, :count do 
    post :import_csv, :csv_file => fixture_file_upload('files/invalid.csv')
  end
end

But this won’t work, because running the test starts a transaction, and ActiveRecord doesn’t support nested transactions. There’s been a patch open on this problem for 9 months, but no action has been taken.
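
To see why the wrapped test misbehaves, consider a toy model of transactions that cannot nest. ToyDatabase and import below are hypothetical stand-ins written for this post, and the depth counter is a simplification rather than ActiveRecord’s code: an inner transaction call simply joins the outer one, so only the outermost block can roll anything back.

```ruby
class ToyDatabase
  attr_reader :rows

  def initialize
    @rows = []
    @depth = 0
  end

  def transaction
    snapshot = @rows.dup if @depth.zero?   # only the outermost call takes a snapshot
    @depth += 1
    begin
      yield
    rescue StandardError
      @rows = snapshot if @depth == 1      # only the outermost call can roll back
      raise
    ensure
      @depth -= 1
    end
  end

  def insert(row)
    raise ArgumentError, "invalid row" if row.nil?
    @rows << row
  end
end

# Plays the role of the controller action: import everything or report failure.
def import(db, rows)
  db.transaction { rows.each { |row| db.insert(row) } }
  :ok
rescue ArgumentError
  :failed
end

standalone = ToyDatabase.new
import(standalone, ["a", nil])
puts standalone.rows.size   # 0, the import rolled itself back

wrapped = ToyDatabase.new
wrapped.transaction do      # plays the role of the test's own transaction
  import(wrapped, ["a", nil])
  puts wrapped.rows.size    # 1, the failed import left a row behind
end
```

In the wrapped case the import still reports failure, but the row inserted before the error remains visible, which is the kind of difference the count assertion picks up.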

I was able to work around the problem by turning off transactional fixtures for the entire test case class.

class MyTest < Test::Unit::TestCase
  self.use_transactional_fixtures = false

  def test_import_csv_failure
    assert_no_difference Record, :count do 
      post :import_csv, :csv_file => fixture_file_upload('files/invalid.csv')
    end
  end
end

This makes the test run slower, but now it passes. If you’re feeling adventurous, you can install the ActiveRecord nested transactions plugin.

Lots of people have hit this problem. Jerry Kuch blogged about it in January 2006 and ticket 5457 was filed back in June. But hopefully this post will help someone else figure out the problem.