The Hungarian Job AssignmentDate: 03/10/2011 at 04:17:34 From: Agan Subject: assignment problem. I have n people and n machines for them to operate. Each person operates a single machine and commands a different wage for operating that machine. How can I assign the machines among my employees so that I have to pay them the least money in total? I have a solution from crossing different rows and arranging them in a specific order, but I need to understand the process. I think I have to consider all the possible combinations, which is difficult for even a modest value of n. This is a practical problem, but it looks like it has no easy answer. Please help! Date: 03/11/2011 at 03:21:18 From: Agan Subject: assignment problem. Sorry I didn't explain that properly. Let me try again with an example. I have 3 engineers (A, B, and C) and 3 projects (P, Q, and R) to do. Here is the money each wants to undertake each project: P Q R +----------- A | 15 18 18 B | 17 23 17 C | 20 18 17 As the manager of the engineers, what project should be given to whom to minimize my total payroll? How do I solve this scenario for larger and generalised cases? Date: 03/13/2011 at 06:48:23 From: Doctor Jacques Subject: Re: assignment problem. Hi Agan, This problem can be solved by linear programming, but there is a much more efficient method, called the Hungarian algorithm. We start with an n*n matrix with integer entries representing the costs of the assignments (you can use negative costs to represent profits). The goal is to select one entry from each row and column to make the total cost minimal. As an example, we will use the following cost matrix: 1 2 3 4 5 +---------- A | 8 3 5 4 3 B | 2 6 9 4 7 C | 6 1 8 4 3 D | 5 7 9 8 8 E | 5 7 9 4 3 A first observation is that, since you must use one entry in each row, the optimal assignment is not changed if you subtract the same number from all the costs in a row. The total cost will not be the same, but once you have found the optimal assignment, you can go back to the original table to find the corresponding cost. Just keep track of the adjustments you make. In this case, we subtract from each row the smallest cost of that row. This gives the table: 1 2 3 4 5 +---------- A | 5 0 2 1 0 B | 0 4 7 2 5 C | 5 0 7 3 2 D | 0 2 4 3 3 E | 2 4 6 1 0 We can now do the same thing with the columns: 1 2 3 4 5 +---------- A | 5 0 0 0 0 B | 0 4 5 1 5 C | 5 0 5 2 2 D | 0 2 2 2 3 E | 2 4 4 0 0 By construction, this table only contains non-negative entries, and it contains at least one 0 in each row and column. As stated before, the optimal assignment for this table is the same as for the original table. In particular, since any assignment in this table will have a non-negative cost, an assignment that uses only cells containing 0 will be optimal, if such an assignment exists. The algorithm will consist in the repeated execution of the following steps: 1. Try to assign as many rows as possible using only the 0 cells. 2. If we have assigned all the rows (5 in this case), we are done. 3. Try to modify the assignments to assign more rows, without modifying the table. If this is successful, go back to step 2. 4. Modify the table (in such a way that the optimal assignment is not changed) to create more 0 cells, possibly removing unassigned 0 cells (but not the assigned cells), and go back to step 2. Before going into the details, I would like to outline the ideas involved. In step 3, we try to assign more 0 cells without changing the table. Let us assume that a row is currently not assigned. We know that this row contains at least one 0, because of the initial steps. We could not assign that 0, because there was already another 0 assigned in the same column. We can try to displace that 0 to another column of the same row; this may now create a conflict in the new column. We try to repeat the process until we reach a currently unassigned column. If we end up in such a column, we can move all the assignments in the path, which will allow us to assign another 0. In step 4, we will try to cover all the 0s with the smallest possible number of rows and columns. As a result, the cells that are not covered will all be strictly positive, and we can modify the table to create at least one additional 0 in these cells. It is a consequence of a theorem in Graph Theory that the smallest number of rows and columns needed to cover all the 0s is equal to the largest number of 0s that can be assigned together. The method I will describe is actually based on some ideas used in the proof of that theorem, and the nice thing is that the same computation will produce the data needed for step 3 and step 4. We start with step 1. In this case, we will simply look at each row in turn, and assign the first 0 cell in that row that is not in a currently assigned column, if there is such a cell. Marking them with asterisks, this gives the following assignments: 1 2 3 4 5 +-------------- A | 5 0* 0 0 0 B | 0* 4 5 1 5 C | 5 0 5 2 2 D | 0 2 2 2 3 E | 2 4 4 0* 0 If you solve a small problem by hand, you will probably use various heuristics to get a better assignment; for example, you could start with the rows that only contain one 0. I did not do that in this case, in order to show you the general method while keeping the example small enough. Currently, we have only assigned 3 cells, and we are not done yet. To execute step 3, we will need to keep track of the cells we have visited; this will allow us to retrace our steps if we find a way to change the assignments and assign a new cell. In order to do that, we need to include additional information in our table. We attach to row 'i' a row tag rt(i). This field can contain the following information: i) 0 (or blank) to represent the initial state. ii) -1 if the row is not assigned. iii) >0 if rt(i) contains the number of the column where a conflict was detected and caused the assignment of this row to be changed. We also attach to column 'j' a column tag ct(j) that contains the following information: i) 0 (or blank) to represent the initial state. ii) >0 if ct(j) contains the number (I will use letters instead) of the row in which we tried to assign a cell in this column. We can also add to each row and column a flag to indicate that the corresponding tag was modified during the last pass; this will save some work (I do not show these flags here). We have some general rules: * We make passes alternately on the row tags and the column tags (starting with the rows); in each pass, we look only at the tags that have been modified since the last pass. * We only change tags that are in their initial blank state; once a tag has been modified, it is never modified again in the same run of step 3. Now, let us look at the details. In a row pass, we look at all the row tags that have been modified since the last pass. If the tag of row 'i' has been modified: * We look at all the unassigned 0s in row 'i' (there may be none). * If (i,j) is such a 0, and if ct(j) is blank, we let ct(j) = i and we mark ct(j) as modified. When all rows have been processed, we turn to a column pass. In a column pass, we look at the column tags that have been modified since the last pass; if there is none, step 3 terminates unsuccessfully (this cannot happen for row tags). If the tag of column 'j' has been modified: * We look at the assigned 0 in column 'j.' If there is none, we have found a way to assign one more cell. We do this by retracing all the operations that led to this column (using the row and column tags), and moving the assignments in the path. This will be explained in more detail using the example. * If, on the other hand, cell (i,j) contains an assigned 0, and if rt(i) is blank, we let rt(i) = j, and we mark rt(i) as modified. When all columns have been processed, and if step 3 is not yet finished, we go back to another row pass. To allow you to follow the explanations, I will show here the table after the execution of step 3: 1 2 3 4 5 rt +---------------+-- A | 5 0* 0 0 0 | 2 B | 0* 4 5 1 5 | 1 C | 5 0 5 2 2 |-1 D | 0 2 2 2 3 |-1 E | 2 4 4 0* 0 | +---------------+-- ct| D C A A A | We start by clearing all the row and column tags; we then let rt(i) = -1 for the currently unassigned rows (C and D). In the first row pass, we look at the modified rows (C and D). Row C contains an unassigned 0 in column 2: we let ct(2) = C (if ct(2) had not been blank, we would have left it alone). In a similar way, row D contains an unassigned 0 in column 1, and we let ct(1) = D. We now start a column pass, by looking at the modified columns 1 and 2. In column 1, we have an assigned 0 in row B; we therefore let rt(B) = 1. In the same way, we let rt(A) = 2 because of the assigned 0 in A2. We start another row pass, looking at the modified rows A and B. In row A, we have unassigned 0s in A3, A4, and A5; accordingly, we let ct(3) = ct(4) = ct(5) = A. In row B, we have no unassigned 0, and we do nothing. We start a new column pass, on the modified columns 3, 4, and 5. In column 3, we have no assigned 0. This means that we can use that column for a new assignment. We can now retrace our steps: * We are in column 3. rt(3) = A, which means that we can assign the cell A3, and we must look at row A. * We are now in row A. Since we have assigned a new cell in A3, we have a conflict. The position of the conflict is given by rt(A) = 2: we must remove the assignment in A2 and move to column 2. * Since we have removed the assignment in A2, we must assign a new cell in column 2. The position of that cell is given by ct(2) = C. We assign cell C2 and move to row C. * Since rt(C) = -1, we are done. We have changed assignments on the path A3 -> (A2) -> C2, where the parentheses are used to identify removed assignments. The new table is as follows (I have also included the results of the next execution of step 2): 1 2 3 4 5 rt +---------------+-- A | 5 0 0* 0 0 | B | 0* 4 5 1 5 | 1 C | 5 0* 5 2 2 | D | 0 2 2 2 3 |-1 E | 2 4 4 0* 0 | +---------------+-- ct| D | We now have 4 assignments, and the job is not yet finished. We execute step 3 again: * Row D is unassigned, and has a 0 in D1. We let ct(1) = D. * Column 1 has an assigned 0 in B1. We let rt(B) = 1. * Row B has no unassigned 0. Since no column tags have been modified in the last row pass, step 3 terminates unsuccessfully. We now define a set of rows R and a set of columns C in the following way: * R contains the rows where rt(i) is blank; in this case, R = {A, C, E} * C contains the columns where ct(j) is not blank; in this case, C = {1} We write r and c for the number of rows in R and columns in C, respectively. We can now make a few observations, and it is possible to prove them by thinking (hard) about what happened in step 3: * Each 0 (assigned or not) is in R or C (or both): R and C cover all 0s. * Each assigned 0 is in R or C, but not both. This confirms that r + c is equal to the number of assigned 0s (4 in this case), as stated before. A consequence of this is that the uncovered cells (the intersections of rows {B,C} and columns {2,3,4,5}) only contain strictly positive entries. Let d be the smallest of these entries (in this case, d = 1, corresponding to cell B4). We now modify the table as follows: * We subtract d from each uncovered cell. By the definition of d, these cells will remain non-negative, and at least one of them will become 0. * We add d to each cell that is covered twice (by a row in R and by a column in C). As the unmarked 0s are only covered once, this will not disturb them. Some of the unassigned (and therefore unused) 0s may disappear, however. This modification does not change the optimal assignment, because it is equivalent to subtracting d from the uncovered rows and adding d to the covered columns. This results in a new table (in which I already included the results of the next application of step 3): 1 2 3 4 5 rt +---------------+-- A | 6 0 0* 0 0 | B | 0* 3 4 0 4 | 1 C | 6 0* 5 2 2 | D | 0 1 1 1 2 |-1 E | 3 4 4 0* 0 | 4 +---------------+-- ct| D B E | As the assigned cells have not been modified, we can keep them assigned; there is no need to execute step 1 again. We see that we have created a new 0 in B4; however, we cannot use it directly, because of the cells B1 and E4. We execute step 3 again. I will let you do that; the results are in the table. Step 3 ends in column 5, where there is no assigned 0. This means that we can improve our assignment, using this path: E5 -> (E4) -> B4 -> (B1) -> D1 Though it is not complete, this is the final assignment: 1 2 3 4 5 +--------------- A | 6 0 0* 0 0 B | 0 3 4 0* 4 C | 6 0* 5 2 2 D | 0* 1 1 1 2 E | 3 4 4 0 0* Going back to the original table, we find that the total cost of this assignment is: 5 + 4 + 1 + 5 + 3 = 18 We could also have kept track of the adjustments in the course of the algorithm: * The row minima were 3, 2, 1, 5, 3, for a total of 14 * The column minima (in the second step) were 0, 0, 2, 1, 0, for a total of 3 * In step 4, we subtracted 1 from two rows (B and D) and added 1 to one column (1), for a total decrease of 1. This confirms that the total cost was decreased by 14 + 3 + 1 = 18; as the total cost is 0 in the final table, the original cost was 18. Note that, after step 4, step 3 allowed us to assign one more cell. This is not always the case. It may happen that you have to execute step 4 more than once before you can assign a new cell; the results will be different, because the table is modified each time you execute step 4. However, the algorithm will always terminate after a finite number of steps. To see this, notice that each time we execute step 2, either we assign a new cell (which can only happen a finite number of times), or we execute step 4. In step 4, we subtract d from (n - r)(n - c) cells, and we add d to r*c cells; the total of the table is thus increased by: d(rc - (n - r)(n - c)) = dn(r + c - n) As noted before, r + c is equal to the total number of assigned cells, and this is less than n (otherwise we would have finished). This shows that the expression above is negative; the total decreases each time you execute step 4; as the total must be non-negative, this also can only happen a finite number of times. Please feel free to write back if you require further assistance. - Doctor Jacques, The Math Forum http://mathforum.org/dr.math/ Date: 03/15/2011 at 01:03:17 From: Agan Subject: Thank you (assignment problem.) Thank you for your detailed answer! Thank you, again! |
Search the Dr. Math Library: |
[Privacy Policy] [Terms of Use]
Ask Dr. MathTM
© 1994- The Math Forum at NCTM. All rights reserved.
http://mathforum.org/dr.math/