Ensemble learning-based approach for improving generalization capability of machine reading comprehension systems
[Abstract] Machine Reading Comprehension (MRC) is an active field in natural language processing, and many successful models have been developed in recent years. Despite their high in-distribution accuracy, these models suffer from two issues: high training cost and low out-of-distribution accuracy. Although some approaches have been proposed to tackle the generalization problem, their training costs are prohibitively high. In this paper, we investigate ensemble learning as a lightweight approach to improving the out-of-distribution generalization of MRC systems by aggregating the outputs of several pre-trained base models without retraining a large model. The base models, which have different structures, are first trained separately on different datasets and are then ensembled using weighting and stacking approaches in probabilistic and non-probabilistic settings. Three configurations (heterogeneous, homogeneous, and hybrid) are investigated on eight datasets and six state-of-the-art models. We identify the factors that matter most for the effectiveness of ensemble methods. We also compare the robustness of ensemble and fine-tuned models against data distribution shifts. The experimental results show the effectiveness and robustness of the ensemble approach in improving the out-of-distribution accuracy of MRC systems, especially when the base models have similar accuracies. (c) 2021 Elsevier B.V. All rights reserved.
[Publication date] 2021-11-27 [Publishing institution]
[Validity level] [Subject classification]
[Keywords] Natural Language Processing; Machine Reading Comprehension; Ensemble learning [Timeliness]