Google Research


ICASSP 2023 (to appear)


We present a multi-channel audio signal generation scheme based on machine learning and probabilistic modeling. We start by modeling a multi-channel single-source signal. Such signals are naturally modeled as a single-channel reference signal together with a spatial-arrangement (SA) model specified by an SA parameter sequence. We focus on the SA model and assume that the reference signal is described by some parameter sequence. The SA model parameters are described by a learned probability distribution that is conditioned on the reference-signal parameter sequence and, optionally, an SA conditioning sequence. If present, the SA conditioning sequence specifies a signal class or a specific signal. The single-source method can be applied to multi-source signals either by applying source separation or by using an SA model that operates on non-overlapping frequency bands. Our GAN-based stereo coding implementation of the latter approach shows that our paradigm facilitates plausible high-quality rendering at a low bit rate for the SA conditioning.
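
The decomposition into a reference signal and an SA parameter sequence can be illustrated with a minimal sketch. It is not the GAN-based implementation described above: the names `SAFrame` and `render_multichannel` are hypothetical, and the SA parameters are assumed here to be nothing more than one gain per channel per frame. In the scheme described in the abstract, these parameters would instead be drawn from a learned distribution conditioned on the reference-signal parameters.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SAFrame:
    """Hypothetical SA parameters for one frame: one gain per output channel."""
    gains: Sequence[float]


def render_multichannel(
    reference: Sequence[float],
    sa_sequence: Sequence[SAFrame],
    frame_len: int,
) -> List[List[float]]:
    """Apply a per-frame SA parameter sequence to a single-channel reference.

    Assumption for illustration only: the spatial arrangement is a plain
    per-channel gain, constant within each frame.
    """
    num_channels = len(sa_sequence[0].gains)
    channels: List[List[float]] = [[] for _ in range(num_channels)]
    for i, frame in enumerate(sa_sequence):
        # Slice out the reference samples that belong to frame i.
        segment = reference[i * frame_len:(i + 1) * frame_len]
        for ch, gain in enumerate(frame.gains):
            channels[ch].extend(gain * x for x in segment)
    return channels


# Usage: a four-sample reference rendered to two channels over two frames.
reference = [1.0, 2.0, 3.0, 4.0]
sa_sequence = [SAFrame(gains=(1.0, 0.5)), SAFrame(gains=(0.0, 2.0))]
left, right = render_multichannel(reference, sa_sequence, frame_len=2)
```

For the multi-source case, the same idea would be applied separately in each non-overlapping frequency band, with one SA parameter sequence per band.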
