Penn/UMass/CHOP Biocreative II systems

Koby Crammer
Gideon Mann
Kedar Bellare
Andrew McCallum
Steven Carroll
Yang Jin
Peter White
Proceedings of the Second BioCreative Challenge Evaluation Workshop (2007), pp. 119-124

Abstract

Our team participated in the entity tagging and normalization tasks of Biocreative II. For the entity tagging task, we used a k-best MIRA learning algorithm with lexicons and automatically derived word clusters. MIRA accommodates different training loss functions, which allowed us to exploit gene alternatives in training. We also performed a greedy search over feature templates and the development data, achieving a final F-measure of 86.28%. For the normalization task, we proposed a new specialized on-line learning algorithm and applied it to filter out false positives from a high-recall list of candidates, achieving an F-measure of 69.8%.
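
To illustrate how a MIRA-style learner can accept an arbitrary training loss, the following is a minimal sketch of a single 1-best update, not the k-best variant used in the paper and not the authors' implementation. The function names, the sparse dictionary feature representation, and the clipping constant C are assumptions made for illustration. The caller supplies the loss, so a loss that gives credit for alternative gene mentions could in principle be plugged in here.

```python
def dot(w, feats):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(w.get(k, 0.0) * v for k, v in feats.items())


def mira_update(w, feats_gold, feats_pred, loss, C=1.0):
    """One simplified 1-best MIRA step (illustrative sketch).

    Makes the smallest change to w such that the gold output outscores
    the predicted output by a margin of at least `loss`, with the step
    size clipped to C. `loss` is any non-negative number chosen by the
    caller (hypothetical interface).
    """
    # Feature difference between the gold and predicted outputs.
    diff = dict(feats_gold)
    for k, v in feats_pred.items():
        diff[k] = diff.get(k, 0.0) - v

    norm_sq = sum(v * v for v in diff.values())
    if norm_sq == 0.0:
        return dict(w)  # identical feature vectors: nothing to update

    # How far the current weights fall short of the required margin.
    violation = loss - dot(w, diff)
    tau = min(C, max(0.0, violation / norm_sq))

    new_w = dict(w)
    for k, v in diff.items():
        new_w[k] = new_w.get(k, 0.0) + tau * v
    return new_w
```

For example, starting from empty weights with gold features {"a": 1.0}, predicted features {"b": 1.0}, and a loss of 1.0, the update moves the weights just far enough that the gold output wins by exactly that margin; repeating the update then leaves the weights unchanged.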
