Additive analysis-synthesis using the phase vocoder is a powerful tool for the exploration of musical timbre. In this research, previous investigations of this subject are extended in two significant directions.
First, an improved analysis of the phase vocoder is developed to explain the errors introduced by undersampling and modification of the magnitude and phase-derivative signals. Two sources of error are identified. It is shown that the first of these involves crosstalk between adjacent frequency channels, and can be eliminated through the development of a tracking version of the phase vocoder. Alternatively, restrictions can be placed on the phase-derivative signal to preserve the absolute phase. The second source of error appears to be inherent in the phase vocoder formulation.
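As an illustration of the magnitude and phase-derivative signals referred to above, the following is a minimal sketch of a single analysis channel in the textbook heterodyne form of the phase vocoder. It is not the tracking version developed in this research. The function name `analyze_channel`, its parameters, and the moving-average lowpass filter (a crude stand-in for a proper analysis window) are assumptions made for illustration.

```python
import cmath
import math


def analyze_channel(signal, sample_rate, center_hz, window_len):
    """Analyze one phase vocoder channel (hypothetical helper).

    Returns (magnitudes, deviations_hz): the channel magnitude and the
    phase derivative expressed as a frequency offset from center_hz.
    """
    # Heterodyne: shift the channel center frequency down to 0 Hz.
    shifted = [
        x * cmath.exp(-2j * math.pi * center_hz * n / sample_rate)
        for n, x in enumerate(signal)
    ]

    # Lowpass with a running moving average (assumed filter, not the
    # author's). Output starts once the window is completely filled.
    baseband = []
    acc = 0j
    for n, value in enumerate(shifted):
        acc += value
        if n >= window_len:
            acc -= shifted[n - window_len]
        if n >= window_len - 1:
            baseband.append(acc / window_len)

    # Magnitude signal; the factor 2 compensates for the half-amplitude
    # positive-frequency component of a real sinusoid.
    magnitudes = [2.0 * abs(b) for b in baseband]

    # Phase-derivative signal: wrapped phase difference between
    # successive samples, converted from radians per sample to Hz.
    deviations_hz = [
        cmath.phase(cur * prev.conjugate()) * sample_rate / (2 * math.pi)
        for prev, cur in zip(baseband, baseband[1:])
    ]
    return magnitudes, deviations_hz


if __name__ == "__main__":
    sr = 8000
    tone = [math.sin(2 * math.pi * 443.0 * n / sr) for n in range(800)]
    mags, devs = analyze_channel(tone, sr, center_hz=437.0, window_len=100)
    print(mags[0], devs[0])
```

In the usage above, a 443 Hz sinusoid is analyzed by a channel centered at 437 Hz, so the phase derivative should sit near 6 Hz. The parameters are chosen so that the unwanted image at 880 Hz falls on a null of the 100-sample moving average; with other settings the image leaks through the filter and ripples the estimate, which is one reason a better lowpass filter would be needed in practice.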
Second, the tracking phase vocoder is used to investigate differences between solo and ensemble sounds. A search is conducted for the minimal set of cues that will produce an ensemble sensation. It is shown that the primary requirement is that there be at least four to eight harmonics, each of which has a characteristic amplitude modulation proportional to its frequency. In addition, a number of issues related to the quality of the ensemble sensation and its efficient synthesis are examined.
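The ensemble cue described above can be sketched as a small additive synthesizer. This is only one possible reading of the finding: it assumes that the rate of each harmonic's amplitude modulation is what scales with the harmonic's frequency, and it uses a sinusoidal modulator for simplicity. The function `ensemble_tone`, the parameters `mod_ratio` and `mod_depth`, and the 1/k amplitude rolloff are all hypothetical choices, not values taken from this research.

```python
import math


def ensemble_tone(f0_hz, n_harmonics, duration_s, sample_rate,
                  mod_ratio=0.005, mod_depth=0.5):
    """Sum harmonics whose AM rate scales with harmonic frequency.

    mod_ratio and mod_depth are illustrative assumptions only.
    Returns a list of samples normalized to a peak magnitude of 1.
    """
    n_samples = int(duration_s * sample_rate)
    out = [0.0] * n_samples
    for k in range(1, n_harmonics + 1):
        freq = k * f0_hz
        # Assumed reading: modulation rate proportional to frequency.
        mod_rate = mod_ratio * freq
        for n in range(n_samples):
            t = n / sample_rate
            # Envelope swings between 1 - mod_depth and 1.
            lfo = 0.5 * (1.0 + math.sin(2 * math.pi * mod_rate * t))
            env = 1.0 - mod_depth * lfo
            # 1/k rolloff is an arbitrary spectral shape for the sketch.
            out[n] += (env / k) * math.sin(2 * math.pi * freq * t)
    peak = max((abs(v) for v in out), default=0.0)
    if peak == 0.0:
        return out
    return [v / peak for v in out]


if __name__ == "__main__":
    samples = ensemble_tone(220.0, 6, 0.5, 8000)
    print(len(samples))
```

Six harmonics are used in the usage example because that count falls within the four-to-eight range stated above. A closer approximation of a real ensemble would probably replace the sinusoidal modulator with a slowly varying random envelope per harmonic, but that is left out to keep the sketch short.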