Sensitivity analysis of multilayer neural networks
[Abstract] ENGLISH ABSTRACT: The application of artificial neural networks to solve classification and function approximation problems is no longer an art. Using a neural network does not simply imply presenting a data set to the network and relying on the so-called black box to produce (hopefully accurate) results. Rigorous mathematical analysis now provides a much better understanding of what is going on inside the black box. The knowledge gained from these mathematical studies allows the development of specialized tools to increase performance, robustness and efficiency. This thesis proposes that sensitivity analysis of the neural network output function be used to learn more about the inner workings of multilayer feedforward neural networks. New sensitivity analysis techniques are developed to probe the knowledge embedded in the weights of networks, and to use this knowledge within specialized sensitivity analysis algorithms to improve generalization performance, to reduce learning and model complexity, and to improve convergence performance. A general mathematical model is developed which uses first-order derivatives of the neural network output function with respect to the network parameters to quantify the effect that small perturbations to these network parameters have on the output of the network. This sensitivity analysis model is then used to develop techniques to locate and visualize decision boundaries, and to determine which boundaries are implemented by which hidden units. The decision boundary detection algorithm is then used to develop an active learning algorithm for classification problems which trains only on patterns close to decision boundaries.
Patterns that convey little information about the position of boundaries are therefore not used for training. An incremental learning algorithm for function approximation problems is also developed to incrementally grow the training set from a candidate set by adding to the training set those patterns that convey the most information about the function to be approximated. The sensitivity of the network output to small perturbations of the input pattern is used as a measure of pattern informativeness. Sensitivity analysis is also used to develop a network pruning algorithm to remove irrelevant network parameters. The significance of a parameter is quantified as the influence that small perturbations of that parameter have on the network output. Variance analysis is employed as a pruning heuristic to decide whether a parameter should be removed. Elaborate experimental evidence is provided to illustrate how each of the developed sensitivity analysis techniques addresses the objectives of improved performance, robustness and efficiency. These results show that the different models successfully utilize the neural network learner's current knowledge to obtain optimal architectures and to make optimal use of the available training data.
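
As an illustration of the idea that the sensitivity of the network output to small perturbations of the input pattern can serve as a measure of pattern informativeness, the following is a minimal sketch and not the thesis's own formulation. It assumes a network with one hidden layer, sigmoid activations and a single output; the function names, the Euclidean-norm score and the weight values are all hypothetical choices made for this example.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def forward(x, W1, b1, w2, b2):
    """One hidden layer and one sigmoid output (an assumed architecture)."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hj for w, hj in zip(w2, h)) + b2)
    return h, y


def input_sensitivity(x, W1, b1, w2, b2):
    """First-order derivatives dy/dx_i, obtained with the chain rule."""
    h, y = forward(x, W1, b1, w2, b2)
    dy = y * (1.0 - y)  # derivative of the output sigmoid
    return [
        dy * sum(w2[j] * h[j] * (1.0 - h[j]) * W1[j][i] for j in range(len(h)))
        for i in range(len(x))
    ]


def informativeness(x, W1, b1, w2, b2):
    """Hypothetical score: Euclidean norm of the sensitivity vector."""
    return math.sqrt(sum(s * s for s in input_sensitivity(x, W1, b1, w2, b2)))


# Illustrative weights only; they are not taken from the thesis.
W1 = [[2.0, -1.0], [-1.5, 0.5]]
b1 = [0.1, -0.2]
w2 = [1.0, -2.0]
b2 = 0.3

# Rank candidate patterns so that the most sensitive one comes first.
candidates = [[0.0, 0.0], [1.0, 1.0], [3.0, -3.0]]
ranked = sorted(
    candidates,
    key=lambda x: informativeness(x, W1, b1, w2, b2),
    reverse=True,
)
```

Under these assumptions, an incremental learner could move the first entries of `ranked` from the candidate set into the training set. How the thesis actually aggregates sensitivities over inputs and outputs is not stated in the abstract.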
[Publication date] [Publisher] Stellenbosch University