Boosting Arabic Named Entity Recognition with K-Fold Cross Validation on LSTM and Bi-LSTM Models
[Abstract] Named Entity Recognition (NER) is one of the most important Information Extraction (IE) use cases and is used to improve the performance of Natural Language Processing (NLP) tasks such as Relation Extraction (RE) and Question Answering (QA). Researchers have recently tackled Arabic NER in different ways. In this study, we assess the performance of two widely used models, LSTM and Bi-LSTM, on the NER task in the Arabic language and perform a comparative study between them. In contrast to the traditional data partition technique widely used during training, we employ k-fold cross-validation to improve the performance of each model. The experimental results reveal that the performance of all models improves when k-fold cross-validation is applied. Additionally, according to our experimental results, the Bi-LSTM model outperforms the LSTM model in terms of our evaluation metric. We achieve the best F1 score of 94.17% with CNN-Bi-LSTM-CRF. An ablation study on k-fold cross-validation demonstrates that the F1 score increased from 87.28% to 94.17%.
[Subject classification] Computer Science (General)
[Keywords] Arabic Named Entity Recognition; LSTM; BiLSTM; K-Fold Cross Validation
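
The k-fold cross-validation mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the record does not state the value of k, whether the data were shuffled, or how fold scores were combined, so the contiguous folds and the simple averaging below are assumptions. The names `k_fold_indices`, `cross_validate`, and `train_and_score` are hypothetical; `train_and_score` stands in for whatever routine trains an LSTM or Bi-LSTM tagger on the training fold and returns its F1 score on the held-out fold.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


def cross_validate(
    sentences: Sequence,
    k: int,
    train_and_score: Callable[[list, list], float],
) -> float:
    """Average the score of a hypothetical train_and_score callback over k folds."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(sentences), k):
        train = [sentences[i] for i in train_idx]
        held_out = [sentences[i] for i in val_idx]
        # Assumption: the callback trains a fresh model on each call.
        scores.append(train_and_score(train, held_out))
    return sum(scores) / len(scores)
```

Each sentence is held out exactly once, so every example contributes to both training and evaluation across the k runs, unlike a single fixed train/test partition.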