Speech recognition based on statistical models including multiple phonetic decision trees

[Abstract] We propose a speech recognition technique that uses multiple model structures. When context-dependent models are used, decision-tree-based context clustering is applied to find an appropriate parameter-tying structure. However, context clustering is usually performed on the basis of unreliable statistics of hidden Markov model (HMM) state sequences, because estimating reliable state sequences requires an appropriate model structure, which cannot be obtained prior to context clustering. Context clustering and the estimation of state sequences therefore cannot, in essence, be performed independently. To overcome this problem, we propose a technique for optimizing state sequences based on an annealing process that uses multiple decision trees. In this technique, a new likelihood function is defined to handle multiple model structures, and the deterministic annealing expectation maximization (DAEM) algorithm is used as the training algorithm. Experimental results on continuous phoneme recognition show that the proposed method, using only two decision trees, achieved a relative error reduction of about 11.1% over the conventional method.


[Subject classification] Acoustics and ultrasonics

[Keywords] Continuous speech recognition; Acoustic modeling; Context clustering; Phonetic decision tree; Deterministic annealing
