Eliya Nachmani

Eliya Nachmani is a Research Scientist at Google Research. He completed his Ph.D. at Tel Aviv University under the supervision of Prof. Lior Wolf. Eliya’s research spans deep learning, audio and speech processing, signal processing, and information theory. His recent work includes advancements in spoken language modeling, speech-to-speech translation, and source separation. For a full list of publications and more information, see his personal website.
Authored Publications
We present Spectron, a novel approach to adapting pre-trained large language models (LLMs) to perform spoken question answering (QA) and speech continuation. By endowing the LLM with a pre-trained speech encoder, our model can take speech inputs and generate speech outputs. The entire system is trained end-to-end and operates directly on spectrograms, simplifying our architecture. Key to our approach is a training objective that jointly supervises speech recognition, text continuation, and speech synthesis using only speech-text pairs, enabling a ‘cross-modal’ chain-of-thought within a single decoding pass. Our method surpasses existing spoken language models in speaker preservation and semantic coherence. Furthermore, the proposed model improves upon direct initialization in retaining the knowledge of the original LLM, as demonstrated through spoken QA datasets. We release our audio samples and spoken QA dataset via our website.
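
The abstract describes a single training objective that jointly supervises speech recognition, text continuation, and speech synthesis in one decoding pass. The sketch below is a minimal, illustrative rendering of that idea only, not the authors' implementation: the model interface, output names, and loss weights are all assumptions, and the individual terms are standard PyTorch losses standing in for whatever the paper actually uses.

```python
# Illustrative sketch of a joint objective combining speech recognition,
# text continuation, and speech synthesis terms. All names (model interface,
# output fields, weights) are hypothetical, not taken from the paper.
import torch
import torch.nn.functional as F


def joint_loss(model, input_spectrogram, transcript_ids, continuation_ids,
               target_spectrogram, w_asr=1.0, w_lm=1.0, w_tts=1.0):
    """One forward pass yields transcript logits, continuation logits, and
    predicted spectrogram frames; the three supervised terms are summed."""
    out = model(input_spectrogram, transcript_ids)  # hypothetical interface

    # Speech recognition: predict the transcript of the input speech.
    # Logits are assumed shaped (batch, time, vocab); transpose for cross_entropy.
    asr = F.cross_entropy(out.transcript_logits.transpose(1, 2), transcript_ids)

    # Text continuation: predict the text that follows the transcript.
    lm = F.cross_entropy(out.continuation_logits.transpose(1, 2), continuation_ids)

    # Speech synthesis: regress spectrogram frames of the spoken continuation.
    tts = F.l1_loss(out.predicted_spectrogram, target_spectrogram)

    return w_asr * asr + w_lm * lm + w_tts * tts
```

Because every term is computed from the same decoding pass over paired speech-text data, gradients from recognition, continuation, and synthesis all flow through the shared decoder, which is the sense in which the objective enables a cross-modal chain-of-thought.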