Speech reconstruction using a deep partially supervised neural network
[Abstract] Statistical speech reconstruction for larynx-related dysphonia has achieved good performance using Gaussian mixture models and, more recently, restricted Boltzmann machine arrays. Deep neural network (DNN)-based systems, however, have been hampered by the limited amount of training data available from individual voice-loss patients. The authors propose a novel DNN structure that allows a partially supervised training approach on spectral features from smaller data sets, yielding very good results compared with the current state of the art.
[Publication date] [Publishing institution]
[Validity level] [Subject classification] Gastroenterology and Hepatology
[Keywords] speech processing;medical signal processing;medical disorders;Boltzmann machines;statistical speech reconstruction;deep partially supervised neural network;larynx-related dysphonia;Gaussian mixture models;restricted Boltzmann machine arrays;voice-loss patients;DNN structure;partially supervised training approach [Timeliness]