Supervised learning with random labelling errors
[Abstract] Classical supervised learning from a training set of labelled examples assumes that the labels are correct. In reality, however, labelling errors may arise, for example, from human mistakes, diverging human opinions, or errors of the measuring instruments. In such cases the training set is misleading and, as a consequence, learning may suffer. In this thesis we consider probabilistic modelling of random label noise. The goal of this research is two-fold: first, to develop new, improved algorithms and architectures on a principled footing that are able to detect and bypass the unwanted effects of mislabelling; second, to study the performance of such methods both empirically and theoretically. We build upon two classical probabilistic classifiers, normal discriminant analysis and logistic regression, and introduce label-noise robust versions of these classifiers. We also develop useful extensions, such as a sparse extension and a kernel extension, in order to broaden the applicability of the robust classifiers. Finally, we devise an ensemble of the robust classifiers in order to understand how the robust models perform collectively. Theoretical and empirical analysis of the proposed models shows that the new robust models are superior to the traditional approaches in terms of parameter estimation and classification performance.
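A minimal sketch of the kind of label-noise model the abstract refers to, using notation assumed here rather than taken from the thesis: if y denotes the unobserved true label and \tilde{y} the observed, possibly flipped label, a flip-probability matrix \gamma links the two, and the robust classifier marginalises over the true label,

  p(\tilde{y}=k \mid x) = \sum_{j} \gamma_{jk}\, p(y=j \mid x), \qquad \gamma_{jk} := p(\tilde{y}=k \mid y=j).

For a binary label-noise robust logistic regression this becomes

  p(\tilde{y}=1 \mid x) = \gamma_{01}\bigl(1-\sigma(w^{\top}x)\bigr) + \gamma_{11}\,\sigma(w^{\top}x),

where \sigma is the logistic function and w the weight vector; the flip probabilities \gamma_{jk} are estimated jointly with w so that mislabelled examples exert less influence on the decision boundary.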
[Date of publication] [Institution] University: University of Birmingham; Department: School of Computer Science
[Keywords] Q Science; QA Mathematics; QA75 Electronic computers. Computer science