Computational Complexity of Bi-clustering
[Abstract] Bi-clustering, i.e., simultaneously clustering the rows and columns of a matrix based on its entries, covers a large variety of techniques in data mining. The goal of every bi-clustering technique is to find partitions of the rows and of the columns in which the sub-rows and sub-columns show similar behavior. Existing algorithms for bi-clustering problems are either heuristic or solve approximations of the original problems; no efficient algorithm is known for exact bi-clustering problems. The computational complexity of a bi-clustering problem depends on its exact formulation, and particularly on the merit function used to evaluate the quality of a given bi-clustering partition. The computational complexity of most common bi-clustering problems is unknown. In this thesis, we present a formal definition of the homogeneous cover problem, which has many applications, from bio-informatics to targeted marketing. We analyze its computational complexity and show that the problem is NP-hard.
[Publishing institution] University of Waterloo
[Keywords] Computer Science