Balint Miklos
Balint is a tech lead on Google Lens, working on image- and text-based classification and content generation systems. He led the development of Multisearch, a search quality system that produces relevant results based on image and text signals, i.e. multimodal search.
In the past, he also led the development of punctuation prediction systems for speech recognition as part of the Google Assistant org.
Previously, Balint led a group in Gmail and Workspace for more than 8 years, building magical user experiences using machine learning. Among other projects, he led the development of Smart Reply in Gmail and of email classifiers for categories such as Promotional and Social, and he advised and mentored the leads of the Smart Compose and Spelling and Grammar suggestions projects in Gmail and Google Docs. Balint earned his PhD in computer science from ETH Zurich in 2010.
Authored Publications
Replacing Human-Recorded Audio with Synthetic Audio for On-Device Unspoken Punctuation Prediction
Bogdan Prisacari
Daria Soboleva
Felix Weissenberger
Justin Lu
Márius Šajgalík
ICASSP 2021: International Conference on Acoustics, Speech and Signal Processing (2021) (to appear)
We present a novel multi-modal unspoken punctuation prediction system for the English language, which relies on Quasi-Recurrent Neural Networks (QRNNs) applied jointly on the text output from automatic speech recognition and acoustic features.
We show significant improvements from adding acoustic features compared to the text-only baseline. Because annotated acoustic data is hard to obtain, we demonstrate that relying on only 20% of human-annotated audio and replacing the rest with synthetic text-to-speech (TTS) predictions does not cause quality loss on the LibriTTS corpus.
Furthermore, we demonstrate that through data augmentation using TTS models, we can remove human-recorded audio completely and outperform models trained on it.
Efficient Natural Language Response Suggestion for Smart Reply
Matthew Henderson
Rami Al-Rfou
Brian Strope
László Lukács
Ray Kurzweil
ArXiv e-prints (2017)
This paper presents a computationally efficient machine-learned method for natural language response suggestion. Feed-forward neural networks using n-gram embedding features encode messages into vectors which are optimized to give message-response pairs a high dot-product value. An optimized search finds response suggestions. The method is evaluated in a large-scale commercial e-mail application, Inbox by Gmail. Compared to a sequence-to-sequence approach, the new system achieves the same quality at a small fraction of the computational requirements and latency.
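As a rough illustration of the dot-product scoring described in this abstract, the sketch below ranks candidate responses by the dot product between a message vector and precomputed response vectors. The embeddings, the candidate texts, and the `rank_responses` helper are hypothetical; the paper's learned encoders and its optimized search are not reproduced here, and an exhaustive scan stands in for that search.

```python
from typing import Dict, List, Tuple


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def rank_responses(
    message_vec: List[float],
    response_vecs: Dict[str, List[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k responses by dot-product score (exhaustive scan)."""
    scored = [(text, dot(message_vec, vec)) for text, vec in response_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up 3-dimensional embeddings, for illustration only.
candidates = {
    "Sounds good!": [0.9, 0.1, 0.0],
    "Sorry, I can't make it.": [0.1, 0.8, 0.2],
    "Thanks for the update.": [0.3, 0.2, 0.9],
}
message = [1.0, 0.0, 0.5]
ranked = rank_responses(message, candidates, top_k=2)
for text, score in ranked:
    print(f"{score:.2f}  {text}")
```

Because response vectors do not depend on the incoming message, they can be computed once ahead of time, which is one reason a scoring scheme of this shape can be cheaper than generating each response token by token.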
Template Induction over Unstructured Email Corpora
Lluís Garcia-Pueyo
Ivo Krka
Tobias Kaufmann
Proc. of the 26th International World Wide Web Conference (2017), pp. 1521-1530
Unsupervised template induction over email data is a central component in applications such as information extraction, document classification, and auto-reply. The benefits of automatically generating such templates are known for structured data, e.g. machine-generated HTML emails. However, much less work has been done in performing the same task over unstructured email data.
We propose a technique for inducing high quality templates from plain text emails at scale based on the suffix array data structure. We evaluate this method against an industry-standard approach for finding similar content based on shingling, running both algorithms over two corpora: a synthetically created email corpus for a high level of experimental control, as well as user-generated emails from the well-known Enron email corpus. Our experimental results show that the proposed method is more robust to variations in cluster quality than the baseline and templates contain more text from the emails, which would benefit extraction tasks by identifying transient parts of the emails.
Our study indicates templates induced using suffix arrays contain approximately half as much noise (measured as entropy) as templates induced using shingling. Furthermore, the suffix array approach is substantially more scalable, proving to be an order of magnitude faster than shingling even for modestly-sized training clusters.
Public corpus analysis shows that email clusters contain on average 4 segments of common phrases, where each of the segments contains on average 9 words, thus showing that templatization could help users reduce the email writing effort by an average of 35 words per email in an assistance or auto-reply related task.
Smart Reply: Automated Response Suggestion for Email
Karol Kurach
Sujith Ravi
Tobias Kaufmann
Laszlo Lukacs
Peter Young
Vivek Ramavajjala
Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) (2016)
In this paper we propose and investigate a novel end-to-end method for automatically generating short email responses, called Smart Reply. It generates semantically diverse suggestions that can be used as complete email responses with just one tap on mobile. The system is currently used in Inbox by Gmail and is responsible for assisting with 10% of all mobile responses. It is designed to work at very high throughput and process hundreds of millions of messages daily. The system exploits state-of-the-art, large-scale deep learning.
We describe the architecture of the system as well as the challenges that we faced while building it, like response diversity and scalability. We also introduce a new method for semantic clustering of user-generated content that requires only a modest amount of explicitly labeled data.
Hierarchical Label Propagation and Discovery for Machine Generated Email
Lluis Garcia-Pueyo
Vanja Josifovski
Ivo Krka
Amitabh Saikia
Jie Yang
Sujith Ravi
Proceedings of the International Conference on Web Search and Data Mining (WSDM), ACM (2016), pp. 317-326
Machine-generated documents such as email or dynamic web pages are single instantiations of a pre-defined structural template. As such, they can be viewed as a hierarchy of template and document specific content. This hierarchical template representation has several important advantages for document clustering and classification. First, templates capture common topics among the documents, while filtering out the potentially noisy variabilities such as personal information. Second, template representations scale far better than document representations since a single template captures numerous documents. Finally, since templates group together structurally similar documents, they can propagate properties between all the documents that match the template. In this paper, we use these advantages for document classification by formulating an efficient and effective hierarchical label propagation and discovery algorithm. The labels are propagated first over a template graph (constructed based on either term-based or topic-based similarities), and then to the matching documents. We evaluate the performance of the proposed algorithm using a large donated email corpus and show that the resulting template graph is significantly more compact than the corresponding document graph and the hierarchical label propagation is both efficient and effective in increasing the coverage of the baseline document classification algorithm. We demonstrate that the template label propagation achieves more than 91% precision and 93% recall, while increasing the label coverage by more than 11%.
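The two-stage idea in this abstract (labels spread over a template graph first, then down to the documents matching each template) can be sketched as follows. This is a simplified hard-label voting scheme under stated assumptions, not the paper's algorithm: the graph, edge weights, seed labels, and the names `propagate_labels` and `doc_to_template` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def propagate_labels(
    edges: List[Tuple[str, str, float]],
    seeds: Dict[str, str],
    iterations: int = 10,
) -> Dict[str, str]:
    """Give each unlabeled template the label with the highest weighted
    vote among its already-labeled neighbors; seed labels never change."""
    neighbors = defaultdict(list)
    for a, b, weight in edges:
        neighbors[a].append((b, weight))
        neighbors[b].append((a, weight))

    labels = dict(seeds)
    for _ in range(iterations):
        updates = {}
        for node in neighbors:
            if node in labels:
                continue
            votes = defaultdict(float)
            for other, weight in neighbors[node]:
                if other in labels:
                    votes[labels[other]] += weight
            if votes:
                updates[node] = max(votes, key=votes.get)
        if not updates:
            break  # nothing new was labeled in this round
        labels.update(updates)
    return labels


# Stage 1: propagate over a made-up template graph (similarity-weighted edges).
template_edges = [("t1", "t2", 0.9), ("t2", "t3", 0.8), ("t4", "t5", 0.7)]
seed_labels = {"t1": "Promotional", "t4": "Social"}
template_labels = propagate_labels(template_edges, seed_labels)

# Stage 2: each document inherits the label of its matching template, if any.
doc_to_template = {"d1": "t3", "d2": "t5", "d3": "t6"}
doc_labels = {
    doc: template_labels[tpl]
    for doc, tpl in doc_to_template.items()
    if tpl in template_labels
}
print(doc_labels)
```

Running the propagation on templates rather than on individual documents keeps the graph small, since one template node stands in for every document that matches it.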