Ruby parsing ambiguities

I was reading Perl Cannot Be Parsed: A Formal Proof on PerlMonks over breakfast this morning (this may in itself cause you to worry about me), which introduced me to a clever, ambiguous snippet of Perl constructed by Randal Schwartz:

whatever  / 25 ; # / ; die "this dies!";

How this is parsed depends on what whatever is: if it’s a function that takes an argument, the slashes delimit a regular expression, and the following statement kills the program.

If, on the other hand, whatever is a function without any arguments, its return value is divided by 25, and the rest of the line is a comment.

Since it’s possible to define whatever dynamically, the snippet can’t be parsed without running the program up to that point. Ergo, Perl cannot be parsed statically.

But can we do the same in Ruby? Initially, it seems possible. The disambiguation of slashes has an additional nicety in Matz’s Ruby interpreter: if there’s a space after the opening slash, it will always be interpreted as a division operator, so the Perl line can’t be copied as it stands. Removing that space works, though:

whatever /25#/; raise 'this dies!'

The meaning of the line depends on whether whatever is a method or a variable:

whatever = 100
whatever /25#/; raise 'this dies!'
# completes
def whatever(re) end
whatever /25#/; raise 'this dies!'
# dies

So it looks like we can perform the Perl trick and use code to define what whatever is, thereby creating Ruby code that can’t be parsed. But, in fact, it doesn’t work, because Ruby’s cleverer than that. And by clever, I mean evil. Ruby looks at everything in the current context—even unreachable code—to determine what the symbol whatever refers to. So this works as you’d expect:

if false
  def whatever() 200 end
else
  whatever = 100
end
whatever # => 100

But reverse the logic, and something strange happens:

if true
  def whatever() 200 end
else
  whatever = 100
end
whatever # => nil

Even though the method is defined and the variable is never assigned, the parser has ‘seen’ the assignment, so from that point on the identifier whatever is treated as a local variable. The message :whatever is never sent, because the name refers to a variable. But that variable has never been assigned! Instead of an error, though, we get nil.
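
A minimal sketch of the same effect in isolation, assuming Ruby 1.9 or later. The method name shadowing_demo is made up for illustration; wrapping the experiment in a method just makes the variable's scope explicit.

```ruby
# Hypothetical helper (not from the original post): shows that an assignment
# the interpreter never executes still makes the parser treat the name as a
# local variable for the rest of the scope.
def shadowing_demo
  if false
    whatever = 100 # never runs, but the parser has now 'seen' a local
  end

  # defined? reports how the parser classified the bare name; reading the
  # unassigned local gives nil instead of raising NameError.
  [defined?(whatever), whatever]
end

p shadowing_demo # prints ["local-variable", nil]
```

The first element shows that the name was classified as a local at parse time; the second is the nil that the unassigned local evaluates to.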

If you’re thinking of getting round it by setting up the definitions in eval, I’ve bad news: that doesn’t work for variables:

eval "def whatever() 200 end"
whatever # => 200

# in a fresh script, without the method defined above:
eval "whatever = 100"
whatever # => NameError: undefined local variable or method `whatever' for main:Object
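
A hedged sketch of why the variable case fails, assuming Ruby 2.1 or later (for Binding#local_variable_get). The helper name eval_local_demo is hypothetical. The local created by eval does exist, but only inside the binding the string was evaluated in; code that was parsed beforehand never learns about it.

```ruby
# Hypothetical helper (not from the original post). Assumes Ruby 2.1 or later,
# where Binding#local_variable_get is available.
def eval_local_demo
  scope = binding
  eval("whatever = 100", scope) # the local is created inside the binding only

  via_binding = scope.local_variable_get(:whatever)

  # This method body was parsed before the eval ran, so the bare name below
  # was compiled as a method call, and no such method exists.
  bare = begin
    whatever
  rescue NameError => e
    e.class
  end

  [via_binding, bare]
end

p eval_local_demo # prints [100, NameError]
```

Note that the same Binding object has to be reused: each call to binding returns a fresh object, and variables added through eval live in that object.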

So there you go. Ruby’s syntax may have some apparent ambiguities, but they can be resolved statically. At least, unless there are any others I don’t know about …
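
One way to check this without running anything is to ask a parser directly. The sketch below assumes MRI with its standard Ripper library; reading_of is a hypothetical helper that only reports whether the parse tree contains a regexp literal or a binary operator.

```ruby
require 'ripper'

# Hypothetical helper (not from the original post): parses source without
# running it and reports which node types of interest appear in the tree.
def reading_of(source)
  nodes = Ripper.sexp(source).flatten
  {
    regexp:   nodes.include?(:regexp_literal),
    division: nodes.include?(:binary)
  }
end

line = "whatever /25#/; raise 'this dies!'"

p reading_of(line)                      # no local in scope
p reading_of("whatever = 100\n" + line) # local assigned earlier in the source
```

The only difference between the two calls is the text that precedes the ambiguous line, which is the information a static parser has available.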

Incidentally, it’s testament to the hard work that the JRuby developers have put in that it has exactly the same behaviour.

Comments

  1. Piers Cawley

    Wrote at 2008-01-29 10:32 UTC using Firefox 3.0b2 on Mac OS X:

    Except…

    if true
      def whatever(it)
      end
    else
      eval "whatever = 100", binding
    end
    
    whatever /25#/; raise "this dies" # RuntimeError: this dies

    Hmm… Can’t seem to get the code to format as code… Still, voila, parsing ruby can’t be done statically.

  2. Paul Battley

    Wrote at 2008-01-29 11:42 UTC using Firefox 2.0.0.11 on Mac OS X:

    I fixed the formatting in your comment, Piers. Sorry about the inconvenience: it turns out that Textpattern wasn’t a great choice for writing about programming.

    The eval in your example is, I think, a bit of a red herring, as it doesn’t actually define whatever if the logic is inverted. If whatever is defined dynamically, it doesn’t matter: the line has already been parsed in the unevaluated context, so whatever is a method to be called—one that happens to be undefined:

    eval "whatever = 100", binding
    whatever /25#/; raise "this dies"
    # NoMethodError: undefined method `whatever' for main:Object

    This means that the second line has already been parsed as a method call with one argument, and that this happened before the evaluation of the first line. The fact that the method that ends up being called is defined later isn't relevant: the meaning of the line can be determined by static parsing.

  3. Piers Cawley

    Wrote at 2008-01-29 12:48 UTC using Unknown browser on Mac OS X:

    Hmm… this is what comes of trying it in irb.

    You can still cause problems by deferring the evaluation of the ambiguous code:

    eval "whatever /25#/; raise 'this dies'"

    But that’s not exactly sporting is it?

  4. Aman Gupta

    Wrote at 2008-01-30 05:47 UTC using Safari 523.10.6 on Mac OS X:

    It is not possible to define a local variable using eval, even if you’re passing in a binding.

    The only reason it might appear to work is that irb is itself running inside eval. Try it with ruby -e '' instead.

  5. Piers Cawley

    Wrote at 2008-01-30 08:24 UTC using Unknown browser on Mac OS X:

    So, now I’m curious. How come you can define local variables in irb at all?

  6. Charles Oliver Nutter

    Wrote at 2008-01-31 03:01 UTC using Firefox 2.0.0.11 on Mac OS X:

    Piers:

    Short answer: because IRB code is all executing under the same top-level eval binding most of the time. In JRuby, we treat eval scopes specially because they’re the only scopes that can grow at runtime. MRI just treats them as dictionaries.

    Long answer: evals in Ruby 1.8 construct a binding scope that’s a child of the current containing scope, if (that containing scope is not itself a binding scope && a binding scope has not already been created). If the former is untrue, the current containing scope (a binding scope) is used for the eval. If the former is true but the latter is untrue, the eval uses the existing binding scope already created. This differs in Ruby 1.9, where evals usually execute under their own new scope every time.

    There’s a particularly nasty eval/binding unit test I wrote for JRuby here…enjoy!

    http://svn.codehaus.org/jruby/trunk/jruby/test/test_eval_with_binding.rb

    All: Yes, Ruby code is statically parseable. In MRI, it does need runtime information in certain contexts like you’ve shown here, but that’s more a limitation of MRI’s parser than an explicit ambiguity in the language itself. In JRuby, Ruby code can be parsed (and compiled) completely offline.
