Previously, I had pretty much avoided implementing any kind of operator precedence for Brat and covered up my reluctance by saying that was just how it was. So binary operations were strictly left-to-right affairs. But that makes things like 1 + 3 * 4 very disappointing. Since I keep feeling like Brat should at least pretend to be a real language, I decided I really needed to do something about this.
What is so hard about it? It doesn’t seem difficult at first. It seems like there ought to be a nice recursive solution to the problem. And there probably is, but if you are using a typical recursive descent parser, this recursion will not be possible.
There are a few ways to go at this issue, but I wanted the simplest and easiest to deal with. A while ago, I had been thinking about this and ran across a reference to the shunting yard algorithm. At the time, I couldn’t really see how it was related. This time around, I looked at it a bit more closely.
The purpose of the algorithm is basically to convert equations which use infix notation (typical math notation with operators in between their arguments) into Reverse Polish Notation. Why would you want to do that? Well, because RPN is explicit about which order operations are evaluated in. It is also pretty straight-forward to implement (which is the important part, really.)
Let’s take a look at the two main parsing rules, written in Treetop, which deal with binary operations. I have elided the related code:
rule binary_operation
lhs:binary_operation_chain rhs:expression { ... }
end
rule binary_operation_chain
(lhs:binary_lhs_expression ~space op:operator ~space)+ { ... }
end
The first rule will match any binary operations. The second rule is a helper, which matches a list of binary operations, all except for the far right expression. There may have been some explicit reason why I separated these before, but it remains convenient in the code to do so.
The second rule will essentially return a list of alternating values and operators, just what the shunting yard algorithm needs.(Well, actually it’s simpler than the algorithm needs.) This list will be fed into the algorithm, which will spit out a new list of values and operators, now in the proper RPN form.
The parser then “evaluates” the RPN expression. It essentially runs through the algorithm for evaluating RPN, outputting code to evaluate expressions and using temporary values. In other words, the RPN algorithm is not explicitly emitted in the code, but the intent is in the order that things are evaluated.
The whole thing ends up being about 50 lines of code, which is far less than I thought I would be looking at for this.
Anyhow, just wanted to share. I’m pretty excited that Brat has this feature now, especially since it can be implemented in such a flexible way. Operator precedence is stored in a hash table which just associates operators with an integer representing its precedence level, so it is quite simple to add more operators (operators not in the table are just given the lowest precedence, though, so arbitrary operators are still completely supported).

