已收录 273081 条政策
 政策提纲
  • 暂无提纲
Estimating Community Structure in Networks by Spectral Methods.
[摘要] Networks are studied in a wide range of fields, including social psychology, sociology, physics, computer science, probability, and statistics. One of the fundamental problems in network analysis is the problem of estimating community structure. Most of existing methods rely on maximizing a criterion over the discrete set of community-label vectors. They require a good initial estimate of communities, which is often found by spectral clustering. Several problems arise from this approach: it has been empirically observed and theoretically predicted that spectral clustering fails when the network is sparse; solving an optimization problem over a discrete set of label vectors is a challenging task; and the number of communities is often unknown in practice. This dissertation contributes to progress on each of these problems.We study random graphs with possibly different edge probabilities in the challenging sparse regime of bounded expected degrees. Unlike in the dense case, neither the network adjacency matrix nor its Laplacian concentrate around their expectations due to the highly irregular distribution of nodedegrees. We prove that simple regularization procedures force the adjacency matrix and the Laplacian to concentrate. As an immediate consequence, we establish the validity of regularized spectral clustering for estimating community structure.We propose a general approach for maximizing a function of a network adjacency matrix over discrete labels by projecting the set of labels onto a subspace approximating the leading eigenvectors of the expected adjacency matrix. This projection onto a low-dimensional space makes the feasible set of labels much smaller and the optimization problem much easier. We prove a general result about this method and show how to apply it to several previously proposed community estimation criteria, establishing its consistency for label estimation in each case.We propose a simple spectral method for estimating the number of communities. We show that the method performs well under several models and a wide range of parameters, and is guaranteed to be consistent under several asymptotic regimes. We compare the new method to several existing methods for estimating the number of communities and show that it is both more accurate and morecomputationally efficient.
[发布日期]  [发布机构] University of Michigan
[效力级别] Community detection [学科分类] 
[关键词] Network analysis;Community detection;Concentration of sparse random graphs;Computer Science;Mathematics;Science (General);Statistics and Numeric Data;Engineering;Science;Statistics [时效性] 
   浏览次数:25      统一登录查看全文      激活码登录查看全文