Tuesday, February 14, 2012

Modeling the Traveling Salesman

I’m going to attack a well-known problem from the 1800 century in this post: Traveling Salesman Problem (TSP).
The objective in the traveling salesman problem is to find the shortest path through a network graph visiting every node/vertex exactly once and returning to the origin, such a path is called a tour of the graph (or a Hamiltonian cycle if you are into computer science). The problem is NP-hard to solve, meaning it is likely that no feasible exact solution exists for large graphs. There exist a lot of variant on the traveling salesman problem. I will consider the form of the problem in which the graph lies in a plane, also known as the Euclidian TSP. This puts constraints on the distances between nodes and is a symmetric formulation of the problem. But still, this is an NP-hard problem.
To put things into perspective, the number of distinct Hamiltonian cycles through a graph is:
Enumerating all possibilities becomes infeasible for even small graphs. For n = 21 the number of solutions is 1,216,451,004,088,320,000. For the case of n = 101 the number of solutions is approximately 9e+157 which exceeds the approximated number of atoms in the known universe. Of course, I’m not claiming to solve this problem – actually I’m not trying to solve it at all. Many better people have tried that before me. I’ll however focus on testing a TSP solver. It may be exceptionally difficult to solve the problem, but it is possible to construct graphs for which the solution is known (and non-trivial) without too much effort. A model for that could look like:
I plan on doing a series of posts on this problem. In this first post I will define the problem and the method in which I’m constructing a graph for the TSP. In the following posts I will investigate different models for growing TSP graphs, and look into how they perform.

Growing graphs
The intuition behind my approach is to grow the graphs one node at a time. Under certain circumstances the shortest tour can be grown by adding another node causing it to still be optimal in the new graph. Using this approach we start out with a single point that grows to a line, then a triangle. At the triangle stage we have our initial shortest tour. The next step is to splice in one additional node (v’) into the graph – this concept is illustrated here:

A splice si is defined as replacing the edge ei with the two edges that form by connecting the endpoints of ei with v’. The cost of a splicing is defined as:
Where |x| is the Euclidian length of x. The cost is simply the extra length added to the path.
The algorithm for growing the graph by one vertex goes as follows:
Given a graph G = (V,E) where E is an shortest tour in G. Consider any node v = (x,y) not in V, define the minimum cost splice over all feasible edges E* constructed from vertex pairs in G (note E is a subset of E*):
Add v to the graph G if and only if the edge argument minimizer of d is contained in E and:
The resulting graph G’ = (V’,E’) is constructed by adding v to V and splicing ei in E.
Claim: The resulting edge set E’ in G’ is a shortest tour. I will post a mathematical proof of this later in the series.
As the constructed output graph G’ satisfies the conditions for G it follows by induction that we can repeat the growth process.
Time for some modeling!
Modeling the Traveling Salesman in Spec Explorer
The first very crude model[1] I’m going to demonstrate will build a graph on a fixed-size integer lattice, where nodes can be added only at the grid points. This implementation may seem very limited, but in fact there is not much difference in modeling an integer lattice versus a floating-point domain, because any integer lattice graph can be scaled to match any graph on a floating-point domain (in-fact a floating-point domain is just a really large integer interval). However, when we constrain the size of the lattice we are of course limiting the flexibility of the model, but the integer representation has some advantages.
Because we are working on a fixed size lattice we can convert any (x,y) point to an integer index. This index is useful when storing the representation of the current graph, because vertices can be held in a SetContainer<int> allowing the model to be independent of the order in which the vertices are added. The same applies to for edges in the shortest tour of the graph.

The model contains just two rules:
    action abstract static void TravelingSalesman.InsertVertex(int x, int y);
    action abstract static double TravelingSalesman.Solve();

The InsertVertex attempts to grow the current graph by insert a vertex at (x,y) and then checking whether this vertex satisfies the criteria of our growth algorithm.
We start out by exploring a tiny grid (2x2):
config Grid2x2: Main
    action abstract static void TravelingSalesman.InsertVertex(int x, int y)
        where x in {0..1}, y in {0..1};

machine TSM2x2() : Main where ForExploration = false
    construct model program from Grid2x2
    where scope = "TravelingSalesManModel.TSMModelProgram"

machine TSM2x2NoSolve() : Main where ForExploration = true
    InsertVertex* || TSM2x2

Exploring this machine results in:

That is 16 distinct graphs with corresponding shortest tour values, which can be feed to a TSM solver. Of course these graphs are far from challenging, but it proves that we are able to grow graphs on a limited size domain. Also it serves to highlight the number of redundant graphs indicated by the high number of cross-over steps. Going just to a 3x3 lattice we immediately run into problems… There exist a ton of possible graphs we can grow. The largest grown graph has 8 nodes out of the maximum of 9. But let’s try instead to slice the model and only generate graphs starting from the center point of the lattice. We can easily do this on a 3x3 grid using:
config Grid3x3: Main
    action abstract static void TravelingSalesman.InsertVertex(int x, int y)
        where x in {0..2}, y in {0..2};

machine TSM3x3() : Main where ForExploration = false
    construct model program from Grid3x3
    where scope = "TravelingSalesManModel.TSMModelProgram"

machine TSM3x3CenterScenario() : Main where ForExploration = true
    (InsertVertex(1,1) ; InsertVertex*) || TSM3x3

This machine slices the model by forcing the choice (1,1) for the first vertex. Exploration results in:
This is an extremely complicated solution. The problem here is of course that solution space (the number of applicable graphs) is itself extremely large, and because we are doing Model-Based Testing we are actively seeking to construct any feasible solution from this space.
It is fairly easy to construct a large set of test cases (this model constructs 240 test cases) from this problem space. They all pass against my crude implementation of a TSM solver. The problem however, is that they are not really taxing the implementation because they are all very small graphs, inside a tightly constrained grid. We can of course easily increase the size of the grid, however, the number of small graphs explodes even more.
We have seen how a model can be designed to grow graphs for the Traveling Salesman. However, the crude model struggles already on tiny input domains and produce a large number of uninteresting test cases. In the coming posts I will, formally prove that the algorithm for growing the graphs works. Furthermore, I plan on investigating how we can further slice the crude model to generate graphs of higher complexity.

No comments:

Post a Comment