Background
The Shunting Yard Algorithm, written by Dijkstra, converts an infix expression into a post-fix expression. For example, the expression:
\(3 + 4\) | becomes | \(3\;4\;+\) |
Precedence and Associativity
What makes this difficult is that operators, generally, have some notion of precedence in most languages. For example, multiplication typically has a higher precedence than addition. So the expression 3 + 5 * 6 is 33 (multiplication happens first) rather than 40.
There’s also the issue of associativity. For example, 3 + 4 + 5 happens left-to-right, or it is left-associative. On the other hand, assignment is right-associative. For example, a = b = c = 4, all values receive the value 4 because c = 4 is performed first, while b is assigned to theresult of the assignment of c, which is 4.
For the purposes of this exercise, consider these precedence rules from C++.
Note: This kata is in flux, I’ve added the link to the C++ operators but the examples use ^ as “to the power of” rather than bitwise negate. Follow the examples until I update the list of tests.
Examples
Here is a series of examples with expected results:
title: Katas.ShuntingYardAlgorithm.examples —
Example Infix Expression | Expected Postfix Result | Notes |
empty string | empty string | A good place to start, you’ll create the basic translator class and get a value back. |
null | empty string | Make sure you handle one error case, a null string, by returning a reasonable default value of an empty string. |
45 | 45 | Make sure you can handle a simple, literal number. This test may work out of the box, if not, it should be simple to get it to work. |
+ | + | Make sure you can handle a simple operator. This test will probably work out of the box after getting the previous test to pass. Later on, this test represents a error case of sorts since there are no operands for this operator. |
3 + 8 | 3 8 + | Your first “real” test, can you perform some basic translation? |
2 + 9 - 6 | 2 9 + 6 - | You probably hard-coded the first example, this makes you do a little more work. And it adds a second operator to boot. |
2 + 9 * 6 | 2 9 6 * + | This is the first example where operator precedence makes a difference. |
2 * 10 ^ 6 | 2 10 6 ^ * | The ^ (raised to the power of) operator is typically a right-associative operator. |
2 ^ 3 ^ 4 | 2 3 4 ^ ^ | This was a bit of a dud test I added. It immeidately passed. Might consider leaving this one out. |
a ^ 3 | a 3 ^ | I want my expression evaluator to handle symbols as well as literal numbers. This example forces that behavior. |
(3 + 4) | 3 4 + | Handle ()’s. First, simply remove them. |
(3 + 4) * 5 | 3 4 + 5 * | Now make sure that ()s change the order of evaluation since + typically happensafter *. |
(3+(4-5))*6 | 3 4 5 - + 6 * | A more complex example to verify that the solution is general. Also notice that the spaces between separate tokens has been removed. You’ll need to handle both with and without spaces. |
f(3) | 3 f | This is an example of a function call. Notice, this forces “look ahead” in your algorithm because you do not know if something is a function or a variable unless you look ahead to the next token. |
f(g(4)) | 4 g f | Is your solution general enough to handle nested functions? |
f(3, 4, 19) | 3 4 19 f | Can your solution handle multiple parameters to a function call? |
3 + 4 * 2 / ( 1 - 5 ) ^ 2 ^ 3 | 3 4 2 * 1 5 - 2 3 ^ ^ / + | This example is taken from the Shunting Yard Algorithm writeup on wikipedia. |
f(4+5,1+a^2,(8+b)*10) | 4 5 + 1 a 2 ^ + 8 b + 10 * f | I added this some time later. It incorporates just about everything already done and it specifically pushes handling multiple parameters to a function correctly. I expected this to take some time, but it really did not take much (a few minutes). |
One Approach
The examples above are taken from a recent (June 2009) personal walk-through of the algorithm. I was surprised by my results and the ease with which I was able to implement the algorithm.
I recommend the following approach when trying this problem:
- Use the examples with the expected results shown above
- Work top to bottom in the order shown
- Addone example at a time and get all tests to pass
- Review both your test code and production code, refactoring as you see fit
One note, when I worked on this, I used Java with JUnit 4’s parameterized tests. Here’s what the final test file looks like:
This worked OK, but occasionally I wanted to run just one test. This does not seem to be supported, so I had to comment out various lines in thedata() method. The next time I work on this, I’ll probably use individual test methods.
Getting Started
Here is one possible start to this problem (with the first three tests added and passing):
The Test
Note: Occasionally I experiment with different ways of writing unit tests with JUnit. This is one such example where I attempt to simulate the domain specific language used by cucumber.
The Production Code
And here’s the results after the first three tests:
Hints
I take the approach of splitting the expression into individual tokens and then processing them one by one. To do this, I use regular expressions and to format the output, I use String.format(…). Here are a few hints along those lines:
- “3 + 4”.split(“\s”) - slits the string into an array: {“3”, “+”, “4”}.
- “3 + 4”.split(“\s+”) - same as above, but allows for 1 or more spaces.
- String.format(“%s %s %s”, “3”, “4”, “+”) –> “3 4 +”. Consider using String.format to format your output.
- When searching for operators using regular expressions, remember to escape them. E.g., “\+”. You can generally escape any character.
- Eventually, splitting the expression will be complex enough that it will warrant its own class with its own micro tests.
- The operators have difference associativity and precedence. Eventually, you’ll probably want to create a class for that.
- Eventually you’ll probably want a simple factory that can map a string into an operator.
- If you want to validate a number, you can use a regular expression or, in my case, the DecimalFormat class.
What We Accomplished
Here’s a zip of the source files we created during the dojo: dojo.zip
Here’s a version I’ve worked on a bit more: ShuntingYardAlgorithm.zip Previous
Comments