The k-center problem is a clustering problem in which we are given a complete undirected graph G=(V,E) with a distance dij ≥ 0 between each pair of vertices i,j ∈ V. The distances obey the triangle inequality and model similarity: the smaller the distance between two vertices, the more similar they are. We are also given an integer k.
The goal is to partition the vertices into k clusters so that similar vertices are grouped together. We choose a set S ⊆ V, |S| = k, of k cluster centers, and each vertex assigns itself to its closest cluster center, which partitions the vertices into k clusters. The objective is to minimize the maximum distance of a vertex to its cluster center. Geometrically, we want to find the centers of k balls of the same radius r that cover all points, with r as small as possible.
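To make the objective concrete, here is a minimal sketch in Python of evaluating a candidate set of centers; the dictionary d of pairwise distances and the function name are illustrative assumptions, not part of the problem statement.

```python
def kcenter_objective(vertices, centers, d):
    """Return the maximum distance from any vertex to its closest chosen center."""
    return max(min(d[(v, c)] for c in centers) for v in vertices)

# Example: four points on a line at positions 0, 1, 4, 5 with centers {0, 4}.
points = [0, 1, 4, 5]
d = {(a, b): abs(a - b) for a in points for b in points}
print(kcenter_objective(points, [0, 4], d))   # -> 1, the radius needed to cover all points
```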
A greedy algorithm achieves the best possible approximation guarantee (see below), and it is very simple and intuitive. We first pick a vertex i ∈ V arbitrarily and put it in our set S of cluster centers. We then repeatedly pick the next cluster center to be as far away as possible from the centers chosen so far: while |S| < k, we find a vertex j ∈ V for which the distance d(j,S), the minimum distance from j to any center in S, is maximized, and add j to S. Once |S| = k we are done.
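A short sketch of this greedy farthest-point procedure, assuming the distances are available through a symmetric function dist(i, j) (the function and variable names are illustrative):

```python
def greedy_k_center(vertices, k, dist):
    centers = [vertices[0]]                        # pick an arbitrary first center
    # d_to_S[v] maintains d(v, S), the distance from v to its closest chosen center
    d_to_S = {v: dist(v, centers[0]) for v in vertices}
    while len(centers) < k:
        # the next center is the vertex farthest from the current set S
        j = max(vertices, key=lambda v: d_to_S[v])
        centers.append(j)
        for v in vertices:                         # update d(v, S) for the new center
            d_to_S[v] = min(d_to_S[v], dist(v, j))
    return centers

# Example: greedy_k_center([0, 1, 4, 5], 2, lambda a, b: abs(a - b)) returns [0, 5].
```

Maintaining d_to_S incrementally keeps the running time at O(nk) distance evaluations for n vertices.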
The described algorithm is a 2-approximation algorithm for the k-center problem. In fact, if there exists a ρ-approximation algorithm for the problem with ρ < 2, then P = NP. This can be shown easily with a reduction from the NP-complete dominating set problem: a graph has a dominating set of size at most k if and only if the corresponding k-center instance, in which all distances are either 1 or 2, has optimal value 1, so a ρ-approximation algorithm with ρ < 2 would decide the dominating set problem. The algorithm and its analysis are given by Gonzalez, Clustering to minimize the maximum intercluster distance, 1985. Another 2-approximation algorithm is given by Hochbaum and Shmoys, A best possible heuristic for the k-center problem, 1985.
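The reduction can be sketched as follows, assuming the dominating set instance is given as a list of vertices and edges (the function name is a hypothetical label for this illustration): adjacent vertices get distance 1, non-adjacent vertices distance 2, so the triangle inequality holds automatically.

```python
def dominating_set_to_kcenter(vertices, edges):
    """Distances: 0 on the diagonal, 1 across edges, 2 otherwise."""
    edge_set = {frozenset(e) for e in edges}
    d = {}
    for i in vertices:
        for j in vertices:
            if i == j:
                d[(i, j)] = 0
            elif frozenset((i, j)) in edge_set:
                d[(i, j)] = 1
            else:
                d[(i, j)] = 2
    return d

# The graph has a dominating set of size at most k exactly when this k-center
# instance has optimal value 1; a rho < 2 approximation would distinguish
# optimal value 1 from 2 and hence decide dominating set.
```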