Lambda, etc. in Ruby 1.9

Posted by Jon
on Monday, September 08

Ruby 1.9 introduces some improvements to Ruby’s lambdas. This is great, in my opinion; much of the power and beauty of Ruby comes from its combination of object oriented programming (the dominant paradigm) with functional programming (baked in quite deeply). I’m glad to see Matz et al committed to improving Ruby’s functional programming support.

Here are a few of the changes in Ruby 1.9.

1. Block arguments are local. This is an obvious choice and a nice improvement. In Ruby 1.8, block arguments clashed with local variables. For example:

i = "hello"
3.times { |i| puts i }
puts i
# 0
# 1
# 2
# 2

We see that the i variable was overwritten by the i block argument. You almost never want this to happen. And Ruby 1.9 fixes this.

i = "hello"
3.times { |i| puts i }
puts i
# 0
# 1
# 2
# hello
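
If you want to check this yourself, here is a minimal sketch, assuming Ruby 1.9 or later. The method name is made up for illustration; it just wraps the snippet above so that the result can be inspected.

```ruby
# Hypothetical helper: returns the outer variable and the values the block saw.
def outer_value_after_block
  i = "hello"
  seen = []
  3.times { |i| seen << i }  # this i is local to the block
  [i, seen]
end

outer, seen = outer_value_after_block
puts outer         # hello
puts seen.inspect  # [0, 1, 2]
```

The outer i survives because the block parameter lives in its own scope.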

2. proc is now an alias of Proc.new. In 1.8, proc was an alias for lambda, and Proc.new was slightly different. An improvement, definitely – but why do we need both proc and Proc.new (plus blocks, plus method(:foo), plus the new stabby lambda)?
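
The difference is easiest to see with argument counts. This is a minimal sketch, assuming Ruby 1.9 or later; call_with_extra is a helper I made up for the example, not part of Ruby.

```ruby
# Hypothetical helper: calls the callable with one argument too many
# and reports what happened.
def call_with_extra(callable)
  callable.call(1, 2)
rescue ArgumentError
  :argument_error
end

loose  = proc   { |a| a }   # same as Proc.new in 1.9
strict = lambda { |a| a }

puts loose.lambda?          # false
puts strict.lambda?         # true
p call_with_extra(loose)    # 1
p call_with_extra(strict)   # :argument_error
```

A proc quietly drops the extra argument, the way a block does, while a lambda checks its arguments the way a method does.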

3. New lambda syntax. Before I show the “stabby lambda” (as described by David A. Black at RailsConf Europe), why do we need another way to define a closure? Doesn’t Ruby already have enough, from blocks to Proc.new to proc to lambda to method()? (See Paul Cantrell’s classic closures in Ruby for more on this.)

The answer is that Ruby’s block/lambda syntax has a significant limitation. Arguments are defined between pipes like this


{|a,b| #code }

or

do |a,b|
# code
end

But block arguments have limitations that regular method arguments don’t. In Ruby 1.8, you couldn’t do the following, though it looks like you can in 1.9.


{|&b| #code }
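
One place where this seems useful is define_method, whose method body is itself a block. A minimal sketch, assuming Ruby 1.9 or later; the Greeter class and its greet method are invented for illustration.

```ruby
class Greeter
  # The method body is a block, and |&blk| lets that block
  # receive a block of its own when greet is called.
  define_method(:greet) do |name, &blk|
    message = "hello #{name}"
    blk ? blk.call(message) : message
  end
end

puts Greeter.new.greet("world")                   # hello world
puts Greeter.new.greet("world") { |s| s.upcase }  # HELLO WORLD
```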

But default parameter values are still out. This code doesn’t work in either 1.8 or 1.9:


{|a = 0| #code }

And that’s too bad. For blocks to be true anonymous methods, they should work like regular methods, which includes allowing default values for optional arguments.

Apparently, it would be really tough to get this to work with Ruby’s parser, because this would introduce the possibility of |a, b=c|d| (the middle pipe being an “or”), and the parser would get confused. I’m not a language designer, so I don’t know just how tough this problem is, but I would love to see it solved. Because the alternative is the “stabby lambda”.


a_function = ->(a, b=0) { }  # code goes between the braces

If there’s one thing that Ruby doesn’t need, it’s a new lambda syntax. Again, see Paul’s article on the subject. And while I don’t necessarily mind the syntax of the stabby lambda – it’s among the uglier things in Ruby, but I could live with it – I really wish we didn’t need another lambda syntax to fix a shortcoming of the Ruby parser.
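
For what it's worth, here is the stabby lambda with a default value in action. A minimal sketch, assuming Ruby 1.9 or later; ADD is just a name picked for the example.

```ruby
# b falls back to 0 when the caller leaves it out.
ADD = ->(a, b = 0) { a + b }

puts ADD.call(5)     # 5
puts ADD.call(5, 2)  # 7
puts ADD.(5, 2)      # 7 (.() is shorthand for .call)
puts ADD.arity       # -2 (one required argument, plus optional ones)
```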

There may be a second reason for this -> operator – syntactic sugar. Some (including Dave Thomas) consider the stabby lambda syntax to be clearer when passing two anonymous functions to a method. Ruby methods only allow a single argument to be received as a block, like this:

def some_method(&b)
  b.call
end

some_method do 
  puts "hello world"
end

You can’t, for instance, pass two such blocks to a method, like:

def some_method(&a, &b)
  a.call
  b.call
end

some_method { puts "first block" } do 
  puts "hello second block"
end

Instead, if you want to pass two such functions to a method, you have to explicitly pass one as a lambda or proc, like so:

def some_method(a, &b)
  a.call
  b.call
end

some_method lambda { puts "first block" } do 
  puts "hello second block"
end

Notice the lambda keyword? It’s reasonably clear what is going on here, but the “method_name lambda {}” syntax is a little funny. So as a possible bonus, the stabby lambda enables a shorter, and (possibly?) more clear syntax:

some_method -> { puts "first block" } do 
  puts "hello second block"
end

I’m mixed on whether this is an improvement or not. Personally, I embrace the lambda keyword, because doing so has helped me to better understand the language and connect Ruby to my explorations in functional programming (like SICP). So if I had to pass two anonymous functions to a method, I’d probably stick with the lambda keyword in Dave’s first example, but parenthesized to make it clear that it’s just an argument:


some_method(lambda { puts "first block" }) { puts "second block" }
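
To compare the two spellings side by side, here is a minimal sketch, assuming Ruby 1.9 or later. run_both is a hypothetical method that returns the results of both callables instead of printing, so they are easy to inspect.

```ruby
# Hypothetical method: one callable arrives as a regular argument,
# the other as the method's block.
def run_both(first, &second)
  [first.call, second.call]
end

p run_both(lambda { :first }) { :second }  # [:first, :second]
p run_both(-> { :first }) { :second }      # [:first, :second]
```

Both calls do the same thing, so the choice between lambda and -> comes down to how the call reads.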

There are other minor changes to procs, blocks, and lambdas in Ruby 1.9. See this Ruby 1.9 changelog at Eigenclass for more details, including dozens of other non-lambda-related changes to Ruby 1.9, like fibers, enumerators, new methods, and syntax changes.

Comments

  1. Jason Watkins, September 09, 2008 @ 12:57 AM

    Apparently, it would be really tough to get this to work with Ruby’s parser, because this would introduce the possibility of |a, b=c|d| (the middle pipe being an “or”), and the parser would get confused. I’m not a language designer, so I don’t know just how tough this problem is, but I would love to see it solved.

    The problem is Ruby uses a parser that makes lookahead painful (yacc). More modern parsing algorithms deal with this sort of ambiguity better, but coming up with a Ruby grammar for them would be no small task. XRuby came up with an ANTLR grammar that I assume would be a good starting point, but good luck convincing ruby core to use ANTLR.

    General parsers also have some performance implications, but I’m skeptical that’d matter.

  2. Jon Dahl, September 09, 2008 @ 09:10 AM

    Thanks for the insight, Jason.

    How about another option: don’t allow an | (or) operator in block argument definition, but allow a default value? Honestly, I’ve defined a thousand b=0 arguments for every b=c|d. Suppose that’s already been discussed on the Ruby core mailing list. :)

  3. Drew, September 09, 2008 @ 09:41 AM

    Forgive my ignorance here, but why couldn’t we simply use the existing lambda keyword and have it function differently if given arguments. For example:

    lambda { |x| puts x }      # standard behavior
    lambda(x = 2) { puts x }   # new behavior

    Can the parser not handle this? I think it is much cleaner.

  4. Magnus Holm, September 09, 2008 @ 10:05 AM

    I totally agree with Drew here: Why not make “lambda” a keyword instead of ”->”? It’s much more descriptive!

  5. Jon Dahl, September 09, 2008 @ 10:25 AM

    Drew and Magnus: I don’t see any reason why we couldn’t do that, and that is basically exactly what the -> operator is doing. The problem is that it would only work in cases where a function is being explicitly defined, like Drew’s example. It wouldn’t work when passing a block to a method, because you don’t actually use the lambda keyword there, and so wouldn’t have anywhere to put the parameters.

    For example:

    def sum_two_numbers(a,b,&c)
      c.call(a + b)
    end
    
    sum_two_numbers(1,2) do |i| 
      puts i
    end
    # 3
    

    In this (lame) example, sum_two_numbers takes two arguments (a and b), and the block takes one argument (i). The block argument definition isn’t tied to a lambda keyword. So ->(a,b=1) or lambda(a,b=1) doesn’t really help us out because we don’t have a -> or a lambda.

    That said, I like the idea of allowing lambda(a,b=1) for cases when you’re explicitly declaring an anonymous function.