Acoustic segmentation of speech
[Abstract] A brief history of speech research is given, along with the current state of the art in acoustic speech recognition. The problem of speech segmentation in the acoustic domain using a digital computer is specifically addressed, i.e., determining an acoustic partition in time that has linguistic relevance. In more general terms, this problem is viewed as that of detecting transitions, in a globally nonstationary process, from one locally stationary state to another. Nonstationary analyses are approximated by considering short, fixed-length sections of the time series as seen through a window that moves by a fixed increment. Various nonstationary signal representations are explored in order to establish a feature space suitable for segmentation applications. Spectral representations are generated only as a reference space for comparing any mechanical segmentation procedure with the linguistically determined segmentation of a given speech sample. Temporal representations of the zero crossings of speech signals are explored in detail. In particular, the central sample moments of the reciprocal zero crossings as a function of time are used as input to a simple segmentation algorithm. The results of a demonstration of this algorithm show that speech segmentation, as defined, is possible by nonhuman means.
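The processing chain described in the abstract (zero crossings, reciprocal values, windowed central moments, a simple boundary rule) can be illustrated with a minimal sketch. Several points are assumptions made for illustration and are not taken from the thesis: "reciprocal zero crossings" is read here as the reciprocal of the interval between successive zero crossings, the window length, increment, and threshold are arbitrary, and the boundary rule (a jump in the windowed mean) is a hypothetical stand-in for the thesis's own algorithm. All function names are likewise hypothetical.

```python
import math


def zero_crossing_times(signal, fs):
    """Times (s) of sign changes, located by linear interpolation."""
    times = []
    for n in range(1, len(signal)):
        a, b = signal[n - 1], signal[n]
        if (a < 0 <= b) or (a >= 0 > b):
            frac = a / (a - b)  # fractional position between the two samples
            times.append((n - 1 + frac) / fs)
    return times


def reciprocal_intervals(times):
    """(midpoint time, 1/interval) for each pair of successive crossings.

    Assumption: this is what "reciprocal zero crossings" refers to.
    """
    return [((t0 + t1) / 2, 1.0 / (t1 - t0))
            for t0, t1 in zip(times, times[1:]) if t1 > t0]


def central_moments(values):
    """Sample mean plus second and third central sample moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return mean, m2, m3


def windowed_moments(pairs, duration, win, hop):
    """Moments inside a fixed-length window moved by a fixed increment."""
    frames = []
    k = 0
    while k * hop + win <= duration + 1e-12:
        start = k * hop
        vals = [r for t, r in pairs if start <= t < start + win]
        if vals:
            frames.append((start, central_moments(vals)))
        k += 1
    return frames


def segment(signal, fs, win=0.05, hop=0.025, threshold=0.5):
    """Hypothetical rule: mark a boundary at the centre of any window
    whose mean differs from the previous window's mean by more than
    `threshold` (relative change)."""
    pairs = reciprocal_intervals(zero_crossing_times(signal, fs))
    frames = windowed_moments(pairs, len(signal) / fs, win, hop)
    boundaries = []
    for (_, prev), (start, cur) in zip(frames, frames[1:]):
        if abs(cur[0] - prev[0]) > threshold * prev[0]:
            boundaries.append(start + win / 2)
    return boundaries


def two_tone(fs=8000, f1=100.0, f2=400.0, switch=0.5, duration=1.0):
    """Synthetic test signal: one tone, then another (not real speech)."""
    out = []
    for n in range(int(duration * fs)):
        t = n / fs
        f = f1 if t < switch else f2
        out.append(math.sin(2 * math.pi * f * t))
    return out


if __name__ == "__main__":
    print(segment(two_tone(), 8000))
```

The synthetic signal changes from one locally stationary state to another at 0.5 s, so the sketch should report a boundary near that time. Real speech would need a more careful choice of window, moments, and decision rule than this toy threshold.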
[Publication date] [Publisher] Rice University