Named Entity Transcription with Pair n-Gram Models

Martin Jansche; Richard Sproat

2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009), ACL-IJCNLP 2009, pp. 32-35

Abstract

We submitted results for each of the eight shared tasks. Except for Japanese name kanji restoration, which uses a noisy channel model, our Standard Run submissions were produced by generative long-range pair n-gram models. For the Non-Standard Runs, we augmented most of these models with publicly available data, either from LDC datasets or mined from Wikipedia.
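To make the term concrete: a pair n-gram model is an n-gram model whose tokens are aligned (source, target) symbol pairs, so it assigns a joint probability to a name and its transcription. The following is a minimal sketch of that idea, not the authors' system. It assumes the alignment is already given, uses a bigram with add-one smoothing for brevity (the abstract describes long-range models, whose order and smoothing are not stated here), and trains on a made-up toy set. The names `train_pair_bigram`, `log_prob`, and `TOY` are hypothetical.

```python
from collections import Counter
import math

# Boundary markers, represented as pairs so they share a type with real tokens.
BOS = ("<s>", "<s>")
EOS = ("</s>", "</s>")

# Toy, hand-aligned training data (illustrative only, not from the shared task).
TOY = [
    [("sa", "サ"), ("to", "ト")],
    [("to", "ト"), ("sa", "サ")],
    [("sa", "サ"), ("to", "ト")],
]

def train_pair_bigram(aligned_names):
    """Count bigrams over sequences of (source, target) symbol pairs."""
    bigrams, contexts, vocab = Counter(), Counter(), {EOS}
    for pairs in aligned_names:
        seq = [BOS] + list(pairs) + [EOS]
        vocab.update(pairs)
        for prev, cur in zip(seq, seq[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab

def log_prob(pairs, model):
    """Joint log-probability of an aligned pair sequence (add-one smoothing)."""
    bigrams, contexts, vocab = model
    seq = [BOS] + list(pairs) + [EOS]
    total = 0.0
    for prev, cur in zip(seq, seq[1:]):
        numerator = bigrams[(prev, cur)] + 1
        denominator = contexts[prev] + len(vocab)
        total += math.log(numerator / denominator)
    return total

model = train_pair_bigram(TOY)
frequent = log_prob([("sa", "サ"), ("to", "ト")], model)
rare = log_prob([("to", "ト"), ("sa", "サ")], model)
```

In this toy setting the sequence seen twice in training scores higher than the one seen once. A transcription system built on such a model would additionally need an alignment step and a search over candidate target strings, neither of which is shown here.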

Research Areas

Natural language processing
