D2: Travelling Salesman Problem
After passing your D2 examinations, you decided to become a Travelling Salesman. This involves starting from home and visiting all the towns in your manor and then returning home. Here is a map of your manor (you live at E).
The Travelling Salesman Problem is then... what is the minimum distance you must travel in order to visit each town and return home?
Note that this is different to the Chinese Postman Problem where the task is to go down every arc once. You probably can find the best route for this map but for a larger map with loads of towns there is no simple algorithm. I say simple because obviously we could find every possible way of visiting each town and select the shortest overall route; but unfortunately this method quickly grows to zillions of possible routes (it grows like n!).
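To get a feel for the "try every route" method, here is a minimal Python sketch. The map's distances aren't listed in the text, so it borrows the four-town table of distances from the matrix section further down this page purely as an illustration; the names `DIST`, `d` and `brute_force` are made up for this example.

```python
from itertools import permutations
from math import factorial

# Illustrative distances between four towns (every pair joined directly).
DIST = {
    ("A", "B"): 6, ("A", "C"): 4, ("A", "D"): 5,
    ("B", "C"): 2, ("B", "D"): 4, ("C", "D"): 6,
}

def d(u, v):
    """Distance between two towns, whichever way round they are given."""
    return DIST[(u, v)] if (u, v) in DIST else DIST[(v, u)]

def brute_force(towns, home):
    """Try every order of visiting the other towns and keep the shortest."""
    others = [t for t in towns if t != home]
    best_length, best_route = None, None
    for order in permutations(others):
        route = (home,) + order + (home,)
        length = sum(d(a, b) for a, b in zip(route, route[1:]))
        if best_length is None or length < best_length:
            best_length, best_route = length, route
    return best_length, best_route

print(brute_force(["A", "B", "C", "D"], "A"))
# -> (15, ('A', 'C', 'B', 'D', 'A'))

# With n towns there are (n - 1)! orders to check from a fixed home.
for n in (5, 10, 15, 20):
    print(n, factorial(n - 1))
```

Four towns need only 6 orders, but 20 towns already need 19! of them, which is why this method is hopeless for larger maps.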
Upper & Lower Bounds
Because there is no known algorithm that is simple enough to use, we try to find lower and upper bounds. The lower bound is a distance that the best route cannot be shorter than, and the upper bound is a distance that the best route cannot be longer than. If we find a lower bound of 23 and an upper bound of 25, then the answer must be between these two numbers. Hopefully it is clear that we want the lower bound to be as high as we can find and the upper bound to be as low as we can find. Even better if they are the same number, as that must be the answer to the problem.
Upper Bound
There are two ways to quickly find a good upper bound. The upper bound is the most that we have to walk. The first way is to find a minimum spanning tree.
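Now start at home (E) and go to each town before returning home but for now I insist that you use only the arcs on the minimum spanning tree. What do you notice? You should find that you travel along every arc of the tree twice, so twice the weight of the minimum spanning tree is an upper bound.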
If we draw the minimum spanning tree on a straight line, it sometimes helps to see what shortcuts we could take.
If we went from E to A to B to C to D, we could return straight home to E and save time.
Sometimes it's not so obvious where the shortcut is. There might be a few to consider and this causes problems. We'll see another method for finding an upper bound later.
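Here is a rough Python sketch of the minimum spanning tree upper bound. It uses Prim's Algorithm to find the tree and then doubles its weight. The distances are an assumed four-town table (not the map above, whose lengths aren't listed in the text) and the function name `prim_mst` is made up for this example.

```python
# Assumed distances (complete graph on four towns), for illustration only.
DIST = {
    "A": {"B": 6, "C": 4, "D": 5},
    "B": {"A": 6, "C": 2, "D": 4},
    "C": {"A": 4, "B": 2, "D": 6},
    "D": {"A": 5, "B": 4, "C": 6},
}

def prim_mst(dist, start):
    """Return (total weight, list of arcs) of a minimum spanning tree."""
    in_tree = {start}
    arcs = []
    while len(in_tree) < len(dist):
        # Cheapest arc joining a town in the tree to a town outside it.
        w, u, v = min(
            (dist[u][v], u, v)
            for u in in_tree
            for v in dist[u]
            if v not in in_tree
        )
        in_tree.add(v)
        arcs.append((u, v, w))
    return sum(w for _, _, w in arcs), arcs

weight, arcs = prim_mst(DIST, "A")
upper_bound = 2 * weight  # walk every arc of the tree there and back

print(arcs)         # -> [('A', 'C', 4), ('C', 'B', 2), ('B', 'D', 4)]
print(upper_bound)  # -> 20
```

In this table the tree is the straight line A, C, B, D. Going home directly from D to A (5) instead of retracing D, B, C, A (10) is a shortcut that cuts the upper bound from 20 to 15.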
Lower Bound
Now we shall seek a lower bound. Here is our town map with the shortest route marked.
Imagine now that town A didn't exist. The red route for the new map without A is just the minimum spanning tree for the remaining vertices. Now imagine that B doesn't exist instead. What can you see? Try for the other vertices.
If we now reconnect the vertex A using the two shortest lengths, we get the best (red) route. However if we did the same thing for the vertex B, the minimum spanning tree for what's left is CD, DE, EA (= 43 in total). When we connect B to this by the two shortest arcs (BC, BD), the total is smaller than the red route.
This gives us a method for finding a lower bound.
- Delete a vertex and find the minimum spanning tree for what remains
- Reconnect the vertex you deleted using the two smallest arcs
- Repeat this process for all vertices and select the highest as the best lower bound
But we can shorten this whole process. If in step 1 we find a minimum spanning tree that is just a straight line (like when we delete A, the minimum spanning tree is BCDE) and when we reconnect the deleted vertex (A) it joins to the two ends of the line to make a circuit (i.e. B and E), then the result is a lower bound that actually is a proper route. Think about it - we have found a route that is equal to the lowest that the best route can be. We have found the best route!
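The delete-a-vertex method can be sketched in Python as follows. Again the distances are an assumed four-town table rather than the map above, and the names `mst_weight` and `lower_bound_deleting` are made up for this example.

```python
# Assumed distances (complete graph on four towns), for illustration only.
DIST = {
    "A": {"B": 6, "C": 4, "D": 5},
    "B": {"A": 6, "C": 2, "D": 4},
    "C": {"A": 4, "B": 2, "D": 6},
    "D": {"A": 5, "B": 4, "C": 6},
}

def mst_weight(dist, towns):
    """Weight of a minimum spanning tree on the given towns (Prim)."""
    towns = list(towns)
    in_tree = {towns[0]}
    total = 0
    while len(in_tree) < len(towns):
        # Cheapest arc from the tree to a town not yet in it.
        w, v = min(
            (dist[u][v], v)
            for u in in_tree
            for v in towns
            if v not in in_tree
        )
        in_tree.add(v)
        total += w
    return total

def lower_bound_deleting(dist, deleted):
    """Step 1: tree on what remains. Step 2: add the two smallest arcs."""
    rest = [t for t in dist if t != deleted]
    two_shortest = sorted(dist[deleted].values())[:2]
    return mst_weight(dist, rest) + sum(two_shortest)

# Step 3: repeat for every vertex and keep the highest value.
bounds = {t: lower_bound_deleting(DIST, t) for t in DIST}
best_lower_bound = max(bounds.values())

print(bounds)            # -> {'A': 15, 'B': 15, 'C': 15, 'D': 15}
print(best_lower_bound)  # -> 15
```

With this table, deleting A leaves the straight-line tree C, B, D, and A reconnects to its two ends (AC = 4, AD = 5), so the lower bound of 15 is itself a proper route: A, C, B, D, A.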
The Nearest Neighbour Algorithm
Another way of finding an upper bound is to use the Nearest Neighbour Algorithm. This is a basic, common sense algorithm. You start at home and travel to the closest town that you have not yet visited. When you have visited all of the towns, you return home directly.
It is the last bit that can prove to be a long journey - the trip home. For this reason, this algorithm is not perfect but it is straightforward and can be done on a matrix - like Prim's Algorithm but you only consider the latest column of numbers.
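A minimal Python sketch of the Nearest Neighbour Algorithm is below. The distance table is assumed for illustration (a complete graph on four towns), and the function name `nearest_neighbour` is made up for this example. Ties are broken alphabetically, which is just one possible choice.

```python
# Assumed distances (complete graph on four towns), for illustration only.
DIST = {
    "A": {"B": 6, "C": 4, "D": 5},
    "B": {"A": 6, "C": 2, "D": 4},
    "C": {"A": 4, "B": 2, "D": 6},
    "D": {"A": 5, "B": 4, "C": 6},
}

def nearest_neighbour(dist, home):
    """Always move to the closest unvisited town, then go straight home."""
    route = [home]
    total = 0
    current = home
    unvisited = set(dist) - {home}
    while unvisited:
        # Closest town not yet visited (alphabetical order breaks ties).
        nxt = min(unvisited, key=lambda t: (dist[current][t], t))
        total += dist[current][nxt]
        route.append(nxt)
        unvisited.remove(nxt)
        current = nxt
    # The trip home: this last arc is the one we get no choice about.
    total += dist[current][home]
    route.append(home)
    return total, route

print(nearest_neighbour(DIST, "A"))  # -> (15, ['A', 'C', 'B', 'D', 'A'])
```

The total it returns is an upper bound. Starting from a different town can give a different total, so it can be worth trying more than one start.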
Practical vs Classical Problem
In the matrix in the video clip, every town was connected directly to every other. This is called a complete graph. In reality, this is not always the case (as in the example in the diagrams above). This is a difference between a classical TSP and a practical one. The other difference is that in reality it is not always shortest to go directly from one town to another - sometimes it's shorter to go through another town to get there. The rule that the direct route is never longer than an indirect one is called the triangle inequality; it always holds for the classical problem but not always for the practical one.
In the diagram above, it is quicker to go from A to C to B (4+2=6) than it is to go from A to B (7). You can't draw a triangle with these lengths (try it if you don't believe me!) so the triangle inequality does not hold for this bit of graph. This must be a practical rather than classical problem.
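If you'd rather let a computer hunt for broken triangles, here is a small Python sketch using the three lengths quoted above (AB = 7, AC = 4, CB = 2). The names `DIRECT`, `length` and `triangle_violations` are made up for this example.

```python
from itertools import permutations

# Direct arcs from the example above: AB = 7, AC = 4, CB = 2.
DIRECT = {("A", "B"): 7, ("A", "C"): 4, ("B", "C"): 2}

def length(u, v):
    """Length of the direct arc between two towns, in either direction."""
    return DIRECT[(u, v)] if (u, v) in DIRECT else DIRECT[(v, u)]

def triangle_violations(towns):
    """List (u, v, via) where going u -> via -> v beats the direct arc u -> v."""
    found = []
    for u, v, via in permutations(towns, 3):
        # u < v so that each pair of towns is only reported once.
        if u < v and length(u, via) + length(via, v) < length(u, v):
            found.append((u, v, via))
    return found

print(triangle_violations(["A", "B", "C"]))  # -> [('A', 'B', 'C')]
```

An empty list would mean the triangle inequality holds for every trio of towns checked.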
Converting Practical to Classical
In reality, not all towns are connected directly and the triangle inequality might not hold (this is because even direct roads might curve around a bit). We can convert such a problem into a classical problem by doing 2 things...
- If there is not a direct route between 2 towns, pretend there is by using the shortest indirect route
- If the direct route is longer than an indirect route, replace it with the value of the indirect route. This is because you'd never go direct if the indirect route was shorter.
The following graph is not a classical one, as it breaks both rules: one pair of towns has no direct route, and one direct route is longer than an indirect one.
In matrix form this is...
| | A | B | C | D |
|---|---|---|---|---|
| A | - | 7 | 4 | 5 |
| B | 7 | - | 2 | 4 |
| C | 4 | 2 | - | - |
| D | 5 | 4 | - | - |
- There is no direct route from C to D. The shortest indirect route is C to B to D (2+4=6), so add 6 to the matrix.
- It is quicker to go from A to C to B than it is to go directly, so replace the 7 in the matrix with the shorter route (6).
| | A | B | C | D |
|---|---|---|---|---|
| A | - | 6 | 4 | 5 |
| B | 6 | - | 2 | 4 |
| C | 4 | 2 | - | 6 |
| D | 5 | 4 | 6 | - |
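For a bigger network, the conversion can be automated by finding the shortest distance between every pair of towns. The Python sketch below does this with the Floyd-Warshall algorithm, which is my choice of method and is not named on this page. It starts from the practical matrix above, with the missing C to D arc entered as infinity; the names `practical` and `to_classical` are made up for this example.

```python
import math

INF = math.inf  # stands for "no direct route"

# The practical matrix from above (0 on the diagonal, INF for C-D).
practical = {
    "A": {"A": 0, "B": 7, "C": 4, "D": 5},
    "B": {"A": 7, "B": 0, "C": 2, "D": 4},
    "C": {"A": 4, "B": 2, "C": 0, "D": INF},
    "D": {"A": 5, "B": 4, "C": INF, "D": 0},
}

def to_classical(dist):
    """Replace every entry by the shortest distance, direct or indirect."""
    d = {u: dict(row) for u, row in dist.items()}  # copy, keep the original
    for via in d:
        for u in d:
            for v in d:
                # Is it shorter to go from u to v through 'via'?
                if d[u][via] + d[via][v] < d[u][v]:
                    d[u][v] = d[u][via] + d[via][v]
    return d

classical = to_classical(practical)
print(classical["A"]["B"])  # -> 6 (A to C to B beats the direct 7)
print(classical["C"]["D"])  # -> 6 (C to B to D fills the missing arc)
```

The other entries are left unchanged, so the result agrees with the converted matrix above.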
Now we solve the problem using the usual methods for the classical problem (i.e. find an upper bound, a lower bound, etc.).
When we have solved the problem, remember to say what the actual route is and not the pretend route we've just made up. In the practical problem you often revisit towns - something that cannot happen in the classical problem.