Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments

Kevin Gimpel
Nathan Schneider
Brendan O'Connor
Daniel Mills
Jacob Eisenstein
Michael Heilman
Dani Yogatama
Jeffrey Flanigan
Noah A. Smith
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011), Association for Computational Linguistics

Abstract

We address the problem of part-of-speech tagging for English data from the popular microblogging service Twitter. We develop a tagset, annotate data, develop features, and report tagging results nearing 90% accuracy. The data and tools have been made available to the research community with the goal of enabling richer text analysis of Twitter and related social media data sets.
