Jun 21 2006
 

Okay, I’ve been using Ruby for a while now and I really like it, so don’t flame me. BUT! Ruby appears to be a single pass parser. This means that if you have a class that uses another class in the same file, you have to define them in the correct order. Most languages fixed this a long time ago with a two pass parsing strategy, where you collect the symbols in the first pass and then verify and compile in the second pass (or some variation on that). So I’m a bit annoyed at Ruby for missing the boat on that one. Hopefully they fix this.
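
Here is a minimal sketch of the ordering problem, borrowing the class names Test1 and Test2 and the method name foo from the comments below. The method bodies and the constants EARLY_ERROR and LATE_RESULT are assumptions made for illustration. It suggests that the failure shows up when code that needs Test2 runs before the lines defining Test2 have been executed.

```ruby
# Test1, Test2 and foo are the names used in the comments below;
# the method bodies and the two result constants are illustrative assumptions.
class Test1
  def foo
    Test2.new   # Test2 is looked up only when foo actually runs
  end
end

# Top-level code runs as soon as it is reached. Test2 does not exist yet,
# so the constant lookup inside foo raises NameError, which we capture.
EARLY_ERROR =
  begin
    Test1.new.foo
    nil
  rescue NameError => e
    e
  end

class Test2
end

# The same call succeeds once the definition of Test2 has been executed.
LATE_RESULT = Test1.new.foo
```

Moving the Test2 definition above the first call to foo is enough to make the early call work as well.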

One solution is to define the class in a separate file and use a require to include it. This seems to be the best solution at the moment and will support modules as well, but really Ruby should handle both types of declarations regardless of ordering if it intends to let you define multiple classes in a single file.
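
The require workaround can be sketched roughly as follows. The file name test2.rb, the hello method, and the constants LIB_DIR and GREETING are hypothetical, and the file is written to a temp directory only so the example is self-contained. In a real project the file would already be on disk.

```ruby
require "tmpdir"

# Write a hypothetical test2.rb into a temp directory so the sketch is
# self-contained; normally this file would already exist in the project.
LIB_DIR = Dir.mktmpdir
File.write(File.join(LIB_DIR, "test2.rb"), <<~RUBY)
  class Test2
    def hello
      "hello from Test2"
    end
  end
RUBY

# require runs the file immediately, so Test2 exists before Test1 uses it.
require File.join(LIB_DIR, "test2.rb")

class Test1
  def foo
    Test2.new.hello
  end
end

GREETING = Test1.new.foo
```

Because the require line sits at the top, the other class is defined before anything in this file gets a chance to reference it.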

  6 Responses to “Ruby is a single pass parser! Eeck!”

  1. Ruby is an interpreted language. Therefore you’ll have problems when you evaluate code that causes the evaluation of unknown symbols. The code below runs fine, though.

  2. I seem to have broken your comment feature with some ruby code. But you can trust what I wrote. :)

  3. Comment fixed somewhat. The code still isn’t complete, so I’m not sure what it did exactly. You should definitely use PRE tags rather than CODE tags for code snippets.

    Whether or not a language is interpreted, the parser still has to parse an entire file before execution. It needs to build the AST. Ruby doesn’t parse the entire file but appears to only parse the file until it sees the class in question and then stops, meaning that the AST doesn’t have all the symbols from the file (eeck!). This could be a performance thing, but I doubt it.

    I think the code snippet you supplied illustrates two things, and both happen at runtime, not parse time (it got truncated, so I’m not certain). First, Ruby class variables can be declared anywhere without problems, because the class is not fixed and you can add new variables and methods at any time. Second, a class is always initialized by calling the initialize method.

  4. The code I posted was just an example of referring to another class before its definition without any problem.

    An interpreter doesn’t necessarily have to parse the entire file. Typically, they read a line, then evaluate it. That is why that code blew up. Here’s a shorter code sample to illustrate that it isn’t ruby’s problem; it’s not a problem at all, really.

    # python
    f()  # raises NameError: f is not defined yet when this line runs
    def f():
        print("f() called")

  5. Hmmm not sure how you got that to work if the classes are in the same file. Like this blows chunks for me:

    I get this error

    The issue is that in order to parse this code the interpreter starts at the top of the file and starts parsing. It encounters the class Test1 and probably (I haven’t looked at the code) builds a complete AST/symbol-table for it. Next it sees the plain line of code and parses it building another AST. Since that line of code belongs to the global space it is executed. This in turn executes the Test1 method called foo, which tries to create Test2. Since the parser never got to the Test2 definition, explosions. If ruby used a two pass parser it would build the AST/symbol-table for both classes and the global scope and then execute the global scoped code. Then when it encountered the Test2 reference it would already have the AST/symbols for it and could execute it.

    Of course I’m guessing a lot at the implementation, but most programming languages, whether compiled or interpreted, build symbol tables and syntax trees and all that jazz before anything is executed.

  6. The thing about ruby is that it’s mostly interpreted line-by-line in a very naive manner. This makes it possible to use things like attr_accessor/include/private/module_function.

    So, going through it line by line gives you something like: http://p.ramaze.net/17497, and that only covers a small part of what’s made possible by this approach. What is most important, in my opinion, is the fact that every class is just another object assigned to a constant.

    Some people have experimented with using const_missing, which returns a symbol of the requested constant and resolves it at a later point when needed, but that wasn’t too successful.
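
The points made in the last response can be sketched roughly like this. The names Point, Pair, LazyLookup, TRACE, POINT and MISSING are all hypothetical, and the const_missing handler here just returns a placeholder string. It is not a reconstruction of the experiments mentioned above.

```ruby
# Class bodies are ordinary code that runs top to bottom, so attr_accessor
# is just a method call executed while the body is being evaluated.
TRACE = []

class Point
  TRACE << :body_started
  attr_accessor :x, :y          # a method call on the class, run right here
  TRACE << :accessors_defined
end

# A class is an object stored in a constant; Class.new builds one directly.
Pair = Class.new do
  attr_accessor :left, :right
end

# const_missing is the hook Ruby calls, with the constant's name as a
# Symbol, when a constant lookup fails. LazyLookup is a hypothetical module
# whose handler only returns a placeholder string.
module LazyLookup
  def self.const_missing(name)
    "unresolved constant #{name}"
  end
end

POINT = Point.new
POINT.x = 3
MISSING = LazyLookup::Whatever  # no such constant, so const_missing runs
```

Here Point and Pair end up as the same kind of thing, a Class object held in a constant, which is why a definition only becomes visible once the line that creates it has run.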
