ASTRA: Aligning Speech and Text Representations for ASR without Sampling
Abstract
This paper discusses a method for injecting text when training an ASR system without upsampling the text sequence to match the length of the speech sequence.