Alexander Gutkin

Research Areas

Authored Publications

Google Publications

Other Publications

XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages

Sebastian Ruder

Jon Clark

Alexander Gutkin

Mihir Sanjay Kale

Min Ma

Massimo Nicosia

Shruti Rijhwani

Parker Riley

Jean-Michel Sarr

Cindy Wang

John Wieting

Nitish Gupta

Anna Katanova

Christo Kirov

Dana L. Dickinson

Brian Roark

Bidisha Samanta

Connie Tao

David Adelani

Vera Axelrod

Isaac Caswell

Colin Cherry

Dan Garrette

Reeve Ingle

Melvin Johnson

Dmitry Panteleev

Partha Talukdar

Findings of the Association for Computational Linguistics: EMNLP 2023, Association for Computational Linguistics, Singapore, pp. 1856-1884

Helpful Neighbors: Leveraging Neighbors in Geographic Feature Pronunciation

Lion Jones

Richard William Sproat

Haruko Ishikawa

Alexander Gutkin

Transactions of the Association for Computational Linguistics, vol. 11 (2023), 85–101

Design principles of an open-source language modeling microservice package for AAC text-entry applications

Brian Edward Roark

Alexander Gutkin

9th Workshop on Speech and Language Processing for Assistive Technologies (SLPAT-2022), Association for Computational Linguistics (ACL), Dublin, Ireland, pp. 1-16

Building Machine Translation Systems for the Next Thousand Languages

Ankur Bapna

Isaac Caswell

Julia Kreutzer

Orhan Firat

Daan van Esch

Aditya Siddhant

Mengmeng Niu

Pallavi Nikhil Baljekar

Xavier Garcia

Wolfgang Macherey

Theresa Breiner

Vera Saldinger Axelrod

Jason Riesa

Yuan Cao

Mia Chen

Klaus Macherey

Maxim Krikun

Pidong Wang

Alexander Gutkin

Apu Shah

Yanping Huang

Zhifeng Chen

Yonghui Wu

Macduff Richard Hughes

Google Research (2022)

Mockingbird at the SIGTYP 2022 Shared Task: Two Types of Models for Prediction of Cognate Reflexes

Christo Kirov

Richard Sproat

Alexander Gutkin

Proceedings of the 4th Workshop on Research in Computational Typology and Multilingual NLP (SIGTYP 2022) at NAACL, Association for Computational Linguistics (ACL), Seattle, WA, pp. 70-79

Graphemic Normalization of the Perso-Arabic Script

Raiomond Doctor

Alexander Gutkin

Cibu Johny

Brian Roark

Richard Sproat

Proceedings of Grapholinguistics in the 21st Century, 2022 (G21C, Grafematik), Paris, France

Criteria for Useful Automatic Romanization in South Asian Languages

Isin Demirsahin

Cibu Johny

Alexander Gutkin

Brian Edward Roark

Proceedings of the 13th Language Resources and Evaluation Conference.(LREC), European Language Resources Association (ELRA), 20-25 June, Marseille, France (2022), 6662‑6673

Extensions to Brahmic script processing within the Nisaba library: new scripts, languages and utilities

Alexander Gutkin

Cibu Johny

Raiomond Doctor

Lawrence Wolf-Sonkin

Brian Edward Roark

Proceedings of the 13th Language Resources and Evaluation Conference.(LREC), European Language Resources Association (ELRA), 20-25 June, Marseille, France (2022), 6450‑6460

Beyond Arabic: Software for Perso-Arabic Script Manipulation

Alexander Gutkin

Cibu Johny

Raiomond Doctor

Brian Roark

Richard Sproat

Proceedings of the 7th Arabic Natural Language Processing Workshop (WANLP2022) at EMNLP, Association for Computational Linguistics (ACL), Abu Dhabi, United Arab Emirates (Hybrid), pp. 381-387

Finite-state script normalization and processing utilities: The Nisaba Brahmic library

Cibu C Johny

Lawrence Wolf-Sonkin

Alexander Gutkin

Brian Edward Roark

The 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021): System Demonstrations, Association for Computational Linguistics, [Online], Kyiv, Ukraine, April, 2021, pp. 14-23

The Taxonomy of Writing Systems: How to Measure how Logographic a System is

Richard William Sproat

Alexander Gutkin

Computational Linguistics, vol. 47(3) (2021), 477–528

Eidos: An Open-Source Auditory Periphery Modeling Toolkit and Evaluation of Cross-Lingual Phonemic Contrasts

Alexander Gutkin

Proc. of 1st Joint Spoken Language Technologies for Under-Resourced Languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL) Workshop (SLTU-CCURL 2020), European Language Resources Association (ELRA), 11--12 May, Marseille, France, pp. 9-20

Crowdsourcing Latin American Spanish for Low-Resource Text-to-Speech

Adriana Guevara-Rukoz

Isin Demirsahin

Fei He

Shan Hui Cathy Chu

Supheakmungkol Sarin

Knot Pipatsrisawat

Alexander Gutkin

Alena Butryna

Oddur Kjartansson

Proc. 12th Language Resources and Evaluation Conference (LREC 2020), European Language Resources Association (ELRA), 11--16 May, Marseille, France, pp. 6504-6513

NEMO: Frequentist Inference Approach to Constrained Linguistic Typology Feature Prediction in SIGTYP 2020 Shared Task

Alexander Gutkin

Richard William Sproat

Association for Computational Linguistics (ACL), 19th November, Online, pp. 17-28

Towards Induction of Structured Phoneme Inventories

Alexander Gutkin

Martin Jansche

Lucy Skidmore

Association of Computational Linguistics (ACL), 19th November, Online

Developing an Open-Source Corpus of Yoruba Speech

Alexander Gutkin

Isin Demirsahin

Oddur Kjartansson

Clara E. Rivera

Kólá Túbòsún

Proc. of Interspeech 2020, International Speech Communication Association (ISCA), October 25--29, Shanghai, China, 2020., pp. 404-408

Open-source Multi-speaker Speech Corpora for Building Gujarati, Kannada, Malayalam, Marathi, Tamil and Telugu Speech Synthesis Systems

Fei He

Shan Hui Cathy Chu

Oddur Kjartansson

Clara E. Rivera

Anna Katanova

Alexander Gutkin

Isin Demirsahin

Cibu C Johny

Martin Jansche

Supheakmungkol Sarin

Knot Pipatsrisawat

Proc. 12th Language Resources and Evaluation Conference (LREC 2020), European Language Resources Association (ELRA), 11--16 May, Marseille, France, 6494‑-6503

Open-Source High Quality Speech Datasets for Basque, Catalan and Galician

Oddur Kjartansson

Alexander Gutkin

Alena Butryna

Isin Demirsahin

Clara E. Rivera

Does A Priori Phonological Knowledge Improve Cross-Lingual Robustness of Phonemic Contrasts?

Lucy Skidmore

Alexander Gutkin

22nd International Conference on Speech and Computer (SPECOM 2020), Springer, St. Petersburg, Russia, pp. 530-543

Burmese Speech Corpus, Finite-State Text Normalization and Pronunciation Grammars with an Application to Text-to-Speech

Yin May Oo

A Theeraphol

Chen Fang Li

Pasindu De Silva

Supheakmungkol Sarin

Knot Pipatsrisawat

Martin Jansche

Oddur Kjartansson

Alexander Gutkin

Proc. 12th Language Resources and Evaluation Conference (LREC 2020), European Language Resources Association (ELRA), 11--16 May, Marseille, France, pp. 6328-6339

Open-source Multi-speaker Corpora of the English Accents in the British Isles

Isin Demirsahin

Oddur Kjartansson

Alexander Gutkin

Clara E. Rivera

Proc. 12th Language Resources and Evaluation Conference (LREC 2020), European Language Resources Association (ELRA), 11--16 May, Marseille, France, 6532‑-6541

Google Crowdsourced Speech Corpora and Related Open-Source Resources for Low-Resource Languages and Dialects: An Overview

Alena Butryna

Shan Hui Cathy Chu

Isin Demirsahin

Alexander Gutkin

Linne Ha

Fei He

Martin Jansche

Cibu C Johny

Anna Katanova

Oddur Kjartansson

Chen Fang Li

Tatiana Merkulova

Yin May Oo

Knot Pipatsrisawat

Clara E. Rivera

Supheakmungkol Sarin

Pasindu De Silva

Keshan Sodimana

Richard Sproat

Theeraphol Wattanavekin

Jaka Aris Eko Wibawa

2019 UNESCO International Conference Language Technologies for All (LT4All): Enabling Linguistic Diversity and Multilingualism Worldwide, 4--6 December, Paris, France, pp. 91-94

Cross-Lingual Consistency of Phonological Features: An Empirical Study

Cibu C Johny

Alexander Gutkin

Martin Jansche

Proc. of Interspeech 2019 (20th Annual Conference of the International Speech Communication Association), International Speech Communication Association (ISCA), September 15--19, Graz, Austria, pp. 1741-1745

Sampling from Stochastic Finite Automata with Applications to CTC Decoding

Martin Jansche

Alexander Gutkin

A Unified Phonological Representation of South Asian Languages for Multilingual Text-to-Speech

Isin Demirsahin

Martin Jansche

Alexander Gutkin

Proc. The 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU), International Speech Communication Association (ISCA), 29--31 August, Gurugram, India (2018), pp. 80-84

Building Open Javanese and Sundanese Corpora for Multilingual Text-to-Speech

Jaka Aris Eko Wibawa

Supheakmungkol Sarin

Chen Fang Li

Knot Pipatsrisawat

Keshan Sodimana

Oddur Kjartansson

Alexander Gutkin

Martin Jansche

Linne Ha

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), European Language Resources Association (ELRA), 7-12 May 2018, Miyazaki, Japan, pp. 1610-1614

Text Normalization for Bangla, Khmer, Nepali, Javanese, Sinhala, and Sundanese TTS Systems

Keshan Sodimana

Pasindu De Silva

Richard Sproat

A Theeraphol

Chen Fang Li

Alexander Gutkin

Supheakmungkol Sarin

Knot Pipatsrisawat

6th International Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU-2018), International Speech Communication Association (ISCA), 29--31 August, Gurugram, India, pp. 147-151

FonBund: A Library for Combining Cross-lingual Phonological Segment Data

Alexander Gutkin

Martin Jansche

Tatiana Merkulova

Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), European Language Resources Association (ELRA), 7-12 May 2018, Miyazaki, Japan, pp. 2236-2240

Linguistic Typology Features from Text: Inferring the Sparse Features of World Atlas of Language Structures

Alexander Gutkin

Tatiana Merkulova

Martin Jansche

arXiv preprint arXiv:2005.00100 (2018)

Predicting the Features of World Atlas of Language Structures from Speech

Alexander Gutkin

Tatiana Merkulova

Martin Jansche

Uniform Multilingual Multi-Speaker Acoustic Model for Statistical Parametric Speech Synthesis of Low-Resourced Languages

Alexander Gutkin

Proc. of Interspeech 2017, International Speech Communication Association (ISCA), August 20--24, Stockholm, Sweden, pp. 2183-2187

Areal and Phylogenetic Features for Multilingual Speech Synthesis

Alexander Gutkin

Richard Sproat

Proc. of Interspeech 2017, International Speech Communication Association (ISCA), August 20–24, 2017, Stockholm, Sweden, pp. 2078-2082

Recent Advances in Google Real-time HMM-driven Unit Selection Synthesizer

Xavi Gonzalvo

Siamak Tazari

Chun-an Chan

Markus Becker

Alexander Gutkin

Hanna Silen

International Speech Communication Association (ISCA), Sep 8--12, San Francisco, USA, pp. 2238-2242

TTS for Low Resource Languages: A Bangla Synthesizer

Alexander Gutkin

Linne Ha

Martin Jansche

Knot Pipatsrisawat

Richard Sproat

10th edition of the Language Resources and Evaluation Conference, 23-28 May 2016, European Language Resources Association (ELRA), Portorož, Slovenia, pp. 2005-2010

Building Statistical Parametric Multi-speaker Synthesis for Bangladeshi Bangla

Alexander Gutkin

Linne Ha

Martin Jansche

Oddur Kjartansson

Knot Pipatsrisawat

Richard Sproat

5th Workshop on Spoken Language Technologies for Under-resourced languages (SLTU-2016), Procedia Computer Science (Elsevier B.V.), 09--12 May 2016, Yogyakarta, Indonesia, pp. 194-200

No Results Found

Search on Google Scholar

Defining the technology of today and tomorrow.

Philosophy

People

Teams

AI/ML Foundations  & Capabilities

Algorithms & Optimization

Computing Paradigms

Responsible Human-Centric Technology

Science & Societal Impact

Projects

Publications

Resources

Shaping the future, together.

Student programs

Faculty programs

Conferences & events

Alexander Gutkin

Research Areas

Join us

Defining the technology of today and tomorrow.

Philosophy

People

Teams

AI/ML Foundations & Capabilities

Algorithms & Optimization

Computing Paradigms

Responsible Human-Centric Technology

Science & Societal Impact

Projects

Publications

Resources

Shaping the future, together.

Student programs

Faculty programs

Conferences & events

Alexander Gutkin

Research Areas

Filter by:

Year

Team

Research Area

Join us

AI/ML Foundations  & Capabilities