| 
 Problem:Suppose we are going to form a team to accomplish a mission.  Team members are selected from a group of candidates each of whom specializes  in several skills (such as reconnaissance, combat, medical support, etc). The  objective is to make the team as small as possible while satisfying that every  required skill is owned by at least one team member. And this is one situation  that needs an algorithm for set cover problem.
 Formally, the (un-weighted) set cover problem is addressed  as following [3]. We are given a set U of m elements (called the universe and assumed to be {1,2,…,m})  and n subsets S_1,S_2,…S_n where each S_i ⊆ U. Given that the union of all n subsets  equals U, the goal is to find out a minimum number of subsets so that the union  of these subsets still covers U.
 Solution: The problem is known to be NP-hard. Other than looking for  the optimal solution with exponential time, the following will introduce a \ln  n-approximate greedy algorithm with polynomial time complexity. The strategy is  that, at each step, one always selects the subset which contains the largest  number of uncovered elements. Algorithm 
        
          | Input: universe U = {1 , 2 , … , m }, subsets S = {S_1,S_2,…,S_n}Output: set C of subsets whose union covers {1 , 2 , … , m}
 C = Ø
 While U ≠ Ø
 X = S_i from  S such that |S_i ∩ U| is maximized
 Remove  X from S
 C = C ∪ { X }
 U = U \  X
 Return C
 |  Correctness:The following shows that the algorithm is \ln n-approximate  with the proof approach from [1]. Another analysis could be found in [2], which  gives a better (\lnB+1) approximate ratio where B is the size of the largest  subset.
 Theorem 1: The greedy algorithm for set cover problem is \ln  n-approximate.Proof. Let k be the number of subsets selected by the  optimal solution. Let U_t be the set of uncovered elements after the greedy  algorithm has done t iterations. Define n_t = |U_t|. Since the k subsets in  optimal solution must cover U_t, there exists a subset B containing at least n_t/k  elements of U_t. Note that the greedy algorithm has not yet selected B  (otherwise the uncovered elements will not be U_t), so at (t+1)-th iteration  the greedy algorithm will be able to select a subset with at least n_t/k  uncovered elements. Then the number of uncovered elements after (t+1)  iterations becomes n_{t+1} ≤ n_t – n_t/k = n_t(1-1/k). Solving  this inequality yields n_t ≤ n_0(1-1/k)^t = n (1-1/k)^t < n e^{-t/k}  (the last inequality comes from: 1-x≤e^{-x}   for all x, and equality holds only when x=0). Then after T = k\ln n  iterations, n_T will be less than 1, which means n_T=0 and there is no element  left. Thus the approximation ratio is T/ k = ln ( n ).
 References[1] Dasgupta, S., Papadimitriou, C. H., and Vazirani, U. V. Algorithms. McGraw-Hill Higher  Education, 2006.
 [2] V. Chvatal. A greedy heuristic for the set-covering  problem. Mathematics of Operations  Research, 4(3):pp. 233-235, 1979.
 [3] http://en.wikipedia.org/wiki/Set_cover_problem
 |