The impact of training set size and feature dimensionality on supervised object-based classification: a comparison of three classifiers
[Abstract] Supervised classifiers are commonly used in remote sensing to extract land cover information. They are, however, limited in their ability to cost-effectively produce sufficiently accurate land cover maps. Various factors affect the accuracy of supervised classifiers. Notably, the number of available training samples is known to significantly influence classifier performance, and obtaining a sufficient number of samples is not always practical. The support vector machine (SVM) performs well with a limited number of training samples, but little research has been done to evaluate SVM's performance for geographical object-based image analysis (GEOBIA). GEOBIA also allows the easy integration of additional features into the classification process, a factor which may significantly influence classification accuracies. As such, two experiments were developed and implemented in this research. The first compared the performances of object-based SVM, maximum likelihood (ML) and nearest neighbour (NN) classifiers using varying training set sizes. The effect of feature dimensionality on classifier accuracy was investigated in the second experiment.

A SPOT 5 subscene and a four-class classification scheme were used. For the first experiment, training set sizes ranging from 4 to 20 samples per land cover class were tested. The performance of all the classifiers improved significantly as the training set size was increased. The ML classifier performed poorly when few (<10 per class) training samples were used, and the NN classifier performed poorly compared to SVM throughout the experiment. SVM was the superior classifier for all training set sizes, although ML achieved competitive results for sets of 12 or more training samples per class. Training sets were kept constant (20 and 10 samples per class) for the second experiment while an increasing number of features (1 to 22) were included. SVM consistently produced superior classification results.
SVM and NN were not significantly (negatively) affected by an increase in feature dimensionality, but ML's ability to perform under conditions of large feature dimensionalities and few training areas was limited.

Further investigations using a variety of imagery types, classification schemes and additional features; finding optimal combinations of training set size and number of features; and determining the effect of specific features should prove valuable in developing more cost-effective ways to process large volumes of satellite imagery.

KEYWORDS: Supervised classification, land cover, support vector machine, nearest neighbour classification, maximum likelihood classification, geographic object-based image analysis
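The design of the first experiment (classification accuracy as a function of training samples per class) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes synthetic Gaussian "objects" instead of SPOT 5 image objects, a hypothetical three-feature space, and a simplified ML classifier with a diagonal covariance matrix. SVM is left out because it would require a third-party library. All function names and constants are illustrative.

```python
import math
import random
import statistics

CLASSES = 4      # four-class scheme, as in the abstract
N_FEATURES = 3   # hypothetical feature count for the synthetic objects


def make_objects(n_per_class, rng):
    """Draw synthetic objects: each class is a Gaussian blob (assumption)."""
    data = []
    for label in range(CLASSES):
        centre = [3.0 * label] * N_FEATURES  # hypothetical class means
        for _ in range(n_per_class):
            data.append(([rng.gauss(c, 1.0) for c in centre], label))
    return data


def nn_predict(train, x):
    """Nearest neighbour: label of the closest training object."""
    return min(train, key=lambda s: math.dist(s[0], x))[1]


def ml_fit(train):
    """Per-class feature means and variances (diagonal covariance only)."""
    model = {}
    for label in range(CLASSES):
        cols = list(zip(*[x for x, y in train if y == label]))
        means = [statistics.fmean(c) for c in cols]
        variances = [max(statistics.variance(c), 1e-6) for c in cols]
        model[label] = (means, variances)
    return model


def ml_predict(model, x):
    """Maximum likelihood: class with the highest Gaussian log-likelihood."""
    def log_likelihood(label):
        means, variances = model[label]
        return sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances)
        )
    return max(model, key=log_likelihood)


def accuracy(predict, test):
    """Fraction of test objects whose predicted label is correct."""
    return sum(predict(x) == y for x, y in test) / len(test)


def learning_curve(sizes=(4, 8, 12, 16, 20), seed=0):
    """Accuracy of NN and ML for each training set size (per class)."""
    rng = random.Random(seed)
    test = make_objects(200, rng)  # fixed, independent test set
    results = {}
    for n in sizes:
        train = make_objects(n, rng)
        model = ml_fit(train)
        results[n] = {
            "NN": accuracy(lambda x: nn_predict(train, x), test),
            "ML": accuracy(lambda x: ml_predict(model, x), test),
        }
    return results


if __name__ == "__main__":
    for n, acc in learning_curve().items():
        print(n, acc)
```

The synthetic classes here are well separated, so the sketch will not necessarily reproduce the differences between classifiers reported in the abstract. It only shows the experimental loop: fix a test set, vary the number of training samples per class, and record each classifier's accuracy.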
[Publication date] [Publisher] Stellenbosch University
[Validity level] [Subject classification]
[Keywords] [Timeliness]