Document Type
Conference Proceeding
Publication Date
4-2007
Publisher
CNIO Centro Nacional de Investigaciones Oncológicas
Source Publication
Proceedings of the Second Biocreative Challenge Evaluation Workshop
Abstract
In this paper, we propose the use of character n-gram models and multiple conditional random field (CRF) models for BioCreAtIvE 2 Task 1, gene/protein name recognition. We investigated different state transition weighting schemes for CRFs and found that the resulting models produced independent, non-overlapping mentions. To improve recall, the results of multiple models are combined. To improve precision, character n-gram models are used to identify sentences that contain gene/protein mentions. Our best approach achieved a precision of 84.35%, a recall of 81.39%, and an F-measure of 82.85%.
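The two-stage idea in the abstract (combine mentions from several models for recall, then filter by a sentence-level decision for precision) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how overlapping predictions are resolved or how the sentence classifier is applied, so the sketch assumes that earlier models take priority on overlap and that a sentence judged to contain no gene mention has all of its mentions discarded. The names `Mention`, `combine_mentions`, and `filter_by_sentence` are hypothetical.

```python
from typing import List, Tuple

# A mention is a (start, end) character span within one sentence, end exclusive.
# This representation is an assumption made for illustration.
Mention = Tuple[int, int]


def overlaps(a: Mention, b: Mention) -> bool:
    """Return True if the two spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def combine_mentions(per_model: List[List[Mention]]) -> List[Mention]:
    """Union the mentions of several models to raise recall.

    Assumption: when a span overlaps one that is already kept, the span
    from the earlier model in the list wins and the later one is dropped.
    """
    combined: List[Mention] = []
    for mentions in per_model:
        for mention in mentions:
            if not any(overlaps(mention, kept) for kept in combined):
                combined.append(mention)
    return sorted(combined)


def filter_by_sentence(mentions: List[Mention], sentence_has_gene: bool) -> List[Mention]:
    """Keep mentions only if the sentence-level classifier accepted the sentence.

    `sentence_has_gene` stands in for the decision of a character n-gram
    sentence classifier, which is not implemented here.
    """
    return list(mentions) if sentence_has_gene else []


# Two hypothetical CRF models tagging the same sentence.
model_a = [(0, 4)]
model_b = [(2, 6), (10, 15)]
merged = combine_mentions([model_a, model_b])
```

In this toy input, the second model's span `(2, 6)` overlaps a span already kept from the first model and is dropped, while `(10, 15)` is a new, non-overlapping mention and is added.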
Recommended Citation
Struble, Craig; Povinelli, Richard J.; Johnson, Michael T.; Berchanskiy, Dina; Tao, Jidong; and Trawicki, Marek B., "Combined Conditional Random Fields and n-Gram Language Models for Gene Mention Recognition" (2007). Electrical and Computer Engineering Faculty Research and Publications. 147.
https://epublications.marquette.edu/electric_fac/147
Comments
Published version. Published as part of Proceedings of the Second Biocreative Challenge Evaluation Workshop 2007. © 2005 CNIO Centro Nacional de Investigaciones Oncológicas.