Nonparametric Bayesian methods for supervised and unsupervised learning
[摘要] I introduce two nonparametric Bayesian methods for solving problems of supervised and unsupervised learning. The first method simultaneously learns causal networks and causal theories from data. For example, given synthetic co-occurrence data from a simple causal model for the medical domain, it can learn relationships like ;;having a flu causes coughing;;, while also learning that observable quantities can be usefully grouped into categories like diseases and symptoms, and that diseases tend to cause symptoms, not the other way around. The second method is an online algorithm for learning a prototype-based model for categorial concepts, and can be used to solve problems of multiclass classification with missing features. I apply it to problems of categorizing newsgroup posts and recognizing handwritten digits. These approaches were inspired by a striking capacity of human learning, which should also be a desideratum for any intelligent system: the ability to learn certain kinds of ;;simple;; or ;;natural;; structures very quickly, while still being able to learn arbitrary -- and arbitrarily complex - structures given enough data. In each case, I show how nonparametric Bayesian modeling and inference based on stochastic simulation give us some of the tools we need to achieve this goal.
[发布日期] [发布机构] Massachusetts Institute of Technology
[效力级别] [学科分类]
[关键词] [时效性]