Stanford University’s Chinese-to-English Statistical Machine Translation System for the 2008 NIST Evaluation

Michel Galley

Pi-chuan Chang

Daniel Cer

Jenny R. Finkel

Christopher D. Manning

The 2008 NIST Open Machine Translation Evaluation Meeting (2008)

Abstract

This document describes Stanford University’s first entry into a NIST MT evaluation. Our entry to the 2008 evaluation focused mainly on establishing a competent baseline with a phrase-based system similar to those of Och and Ney (2004) and Koehn et al. (2007). In a three-week effort prior to the evaluation, we concentrated on scaling up our system to exploit nearly all Chinese-English parallel data permissible under the constrained track, incorporating competitive language models into the decoder using Gigaword and Google n-grams, evaluating Chinese word segmentation models, and incorporating a document classifier as a pre-processing stage to the decoder. This document is organized as follows: in Section 2, we describe the linguistic resources used for our submission. In Section 3, we present the four main components of our translation system: a phrase-based translation system, a Chinese word segmenter, a text categorizer, and a truecaser. Finally, we discuss our results in Section 4.
