Exploration and exploitation of multilingual data for statistical machine translation
[Abstract] Shortly after the birth of computer science, researchers realised the importance of machine translation as a task worthy of concentrated effort, but it is only recently that algorithms have been able to provide automatic translations usable by the masses. Modern translation systems depend on bilingual corpora, a modern Rosetta Stone, from which they learn cross-lingual relationships that can be used to translate sentences which are not in the training corpus. This data is crucial. If it is insufficient, or out-of-domain, then translation quality degrades. To improve quality, we need to both perfect methods that extract usable translations from additional multilingual resources, and improve the constituent models of a translation system to better exploit existing multilingual data sets.
[Publication date] [Publishing institution] University of Amsterdam
[Validity level] [Subject classification]
[Keywords] [Timeliness]