已收录 268921 条政策
 政策提纲
  • 暂无提纲
Pattern recognition of strings with substitutions, insertions, deletions and generalized transposition
[摘要] We study the problem of recognizing a string Y which is the noisy version of some unknown string X* chosen from a finite dictionary, II. The traditional case which has been extensively studied in the literature is the one in which Y contains substitution, insertion and deletion (SID) errors. Although some work has been done to extend the traditional set of edit operations to include the straightforward transposition of adjacent characters(2) [see J. Assoc. Comput. Mach. 22, 177-183 (1975)] the problem is unsolved when the transposed characters are themselves subsequently substituted, as is typical in cursive and typewritten script, in molecular biology and in noisy chain-coded boundaries. In this paper we present the first reported solution to the analytic problem of editing one string X to another, Y, using these four edit operations. A scheme for obtaining the optimal edit operations has also been given. Both these solutions are optimal for the infinite alphabet case. Using these algorithms we present a syntactic pattern recognition scheme which corrects noisy text containing all these types of errors. The paper includes experimental results involving sub-dictionaries of the most common English words which demonstrate the superiority of our system over existing methods. (C) 1997 Pattern Recognition Society.
[发布日期] 1997-05-01 [发布机构] 
[效力级别]  Proceedings Paper [学科分类] 
[关键词] string correction;traditional and generalized transposition errors;syntactic pattern recognition [时效性] 
   浏览次数:1      统一登录查看全文      激活码登录查看全文