| Rethinking Mixed Integer Model Formulations - Part 2 |
|
Part 2: Avoiding symmetry
This article addresses ways to reduce the symmetry of a MIP—reducing the number of equivalent solutions to reduce the amount of work needed in the branch and bound process.
One way to reduce symmetry is to formulate with patterns instead of assignments. The following are examples of two different formulations for the same problem, one using assignments, the other using patterns.
There are twenty-five trucks separated into 5 types, differentiated by cost and by capacity. The type capacities are 100, 80, 60, 50, 40 and their respective costs are 98, 76, 60, 50, and 40. Twenty loads must be assigned to the trucks, and loads cannot be divided between trucks. The load sizes are 60, 40, 50, 70, 20, 10, 40, 50, 70, 80, 20, 10, 20, 40, 30, 40, 30, 50, 60, and 50.
For the assignment formulation, number the trucks from 1 to 25 and denote their capacities by uj and their costs by cj, where j = a number from 1 through 25. Denote the load sizes by ai, where i = a number from 1 through 20. The objective is to find the minimum cost assignment of loads to the trucks.
The decision variables are xij, where xij = 1, indicating that load i will be placed on truck j and yj = 1, indicating that truck j is used. The assignment formulation is as follows:

The first set of constraints enforces the capacity of each truck. The second set of constraints and the binary nature of the xij variables enforces that each load must be assigned to exactly one truck.
The symmetry in this problem arises in two ways:
- There are multiple trucks of each type and assigning a load to a given truck is no different than assigning it to a different truck of the same type.
- There are multiple loads of the same size, and assigning one to a given truck is no different than assigning another of the same size to a given truck.
To see how the symmetry affects the branching process, consider what happens when an lp sub-problem splits a load between two trucks ,. For example, if load 1 is split between trucks 1 and 2. If you branch down on the variable associated with load 1 going on truck 1 (set the variable to zero), the best way to reassign the load is to re-assign the partial load to one of the other trucks of the same type. With five trucks of each type, there will be theee successive branches that give the same objective value and the same number of infeasibilities. No progress is made until the fourth branch.
There are two ways to improve this formulation: use either additional constraints or priorities. You may add constraints that will order the use of the trucks within a type:
y1 => y2 => y3 => y4 => y5
y6 => y7 => y8 => y9 => y10
Note that there is a constraint for each truck type when done this way. These constraints will not tighten the bound on the best possible integer solution, but it will make sure that truck i is always assigned before truck i+1, preventing unnecessary branches that would arise by swapping identical trucks. There is no restriction between the different truck types.
An alternative approach is to use priorities to alleviate the effect of the symmetry—give each variable for a truck type a different priority. Giving different priority values for trucks of a given type differentiates the trucks in the branching process so the tendency will be away from a series of similar branches.
Another way to avoid symmetry is to use a completely different formulation. A formulation that avoids symmetry will generate all the feasible loading patterns for each type of truck, and choose which loading patterns to use in which quantities. Consider a pattern formulation:
A pattern vector is a vector that has an integer in row n, representing how many loads of the nth size are contained in the loading pattern. A pattern vector dkl represents the lth possible loading pattern for a truck of type k. Pattern vectors will be as long as the number of loads. This example will show 8 loads. The feasible loading patterns for a truck with a load size 40 are: (40), (30, 10), (20, 20), (20, 10, 10) and (10, 10, 10, 10). The corresponding pattern vectors dkl are therefore (sizes 10 to 80 in rows 1 to 8): (0 0 0 1 0 0 0 0), (1 0 1 0 0 0 0 0), (0 2 0 0 0 0 0 0), (2 1 0 0 0 0 0 0), (4 0 0 0 0 0 0 0). Assume there are Lk loading patterns for each truck type k, with k = 1 through 5. Only full-capacity load patterns need be generate, such as those with loads sums to capacity for truck type k.
The demand vector b, has entries which are the number of loads of the nth size required for this example the entries are (2, 3, 2, 4, 4, 2, 2, 1).
There is a decision variable associated with each pattern vector vkl, which is an integer variable and represents how many trucks with this load pattern are used. The reformulated model is as follows:
The first group of constraints uses both the pattern vectors and the demand vector to enforce that the demand is at least satisfied. Since only full-capacity patterns were used, the “greater than” inequality is used.
The second set of constraints sets the limit on the number of trucks of each type to the number of trucks available. For this example there are 5 trucks of each type. The cost gk is the cost of using a truck of type k.
The solution gives the number of trucks using each loading pattern to be used. The loads can then be assigned to trucks. If there is excess capacity in the solution for any of the load sizes, some of the trucks will not be full, but the solution will still be valid.
When branching down on a pattern variable, another pattern vector must be selected. The cost may be the same, but the infeasibilities will be different. The branching process will make progress at every step. A pattern formulation also has the benefit of having LP feasible regions that are closer to the MIP feasible region, and that is the case for this example.
A drawback of the pattern approach is that the number of patterns is combinatorial, and grows rapidly. To handle this drawback turn to the column generation approach, where the pattern columns are not generated all at once; but only during the solution process when it looks like a pattern can be used to obtain a better solution. A column generation algorithm is called a decomposition algorithm.
This topic was inspired by the tutorial given by Professor George L. Nemhauser of the Georgia Institute of Technology at the 15th International Symposium on Mathematical Programming. Further discussions of symmetry in formulations appear in the books Integer and Combinatorial Optimization, G. L. Nemhauser and L. A. Wolsey, John Wiley & Sons, New York, 1988 and Model Building in Mathematical Programming, 3rd. ed., H. P. Williams, John Wiley & Sons, Chichester, 1990.
|