Document Type

Conference Proceeding

Publication Date

4-2007

Publisher

CNIO Centro Nacional de Investigaciones Oncológicas

Source Publication

Proceedings of the Second Biocreative Challenge Evaluation Workshop

Abstract

In this paper, we propose the use of character n-gram models and multiple conditional random field (CRF) models for BioCreAtIvE 2 Task 1, gene/protein name recognition. We investigated different state transition weighting schemes for CRFs and found that the resulting models produced independent, non-overlapping mentions. To improve recall, we combine the results of multiple models. To improve precision, character n-gram models classify sentences according to whether they contain a gene/protein mention. Our best approach achieved a precision of 84.35%, a recall of 81.39%, and an F-measure of 82.85%.
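
The combination and filtering steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the span representation, the priority rule for overlapping mentions, and the function names `combine_mentions`, `filter_by_sentence`, and `f_measure` are all assumptions made for illustration. The sketch takes the union of mentions from several models (to raise recall), discards all mentions in a sentence that a sentence-level classifier rejects (to raise precision), and computes the balanced F-measure from precision and recall.

```python
from typing import List, Tuple

# Hypothetical representation: a mention is a (start, end) character span,
# with the end offset exclusive.
Span = Tuple[int, int]


def overlaps(a: Span, b: Span) -> bool:
    """Return True if the two spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def combine_mentions(model_outputs: List[List[Span]]) -> List[Span]:
    """Union the mentions of several models for one sentence.

    Assumption: when spans overlap, the span from the earlier model in
    the list is kept and the later one is dropped.
    """
    combined: List[Span] = []
    for spans in model_outputs:
        for span in spans:
            if not any(overlaps(span, kept) for kept in combined):
                combined.append(span)
    return sorted(combined)


def filter_by_sentence(mentions: List[Span], sentence_has_mention: bool) -> List[Span]:
    """Drop every mention if the sentence classifier rejects the sentence.

    `sentence_has_mention` stands in for the decision of a character
    n-gram sentence classifier, which is not implemented here.
    """
    return mentions if sentence_has_mention else []


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    merged = combine_mentions([[(0, 4)], [(2, 6), (10, 15)]])
    print(merged)
    print(filter_by_sentence(merged, sentence_has_mention=False))
    print(round(100 * f_measure(0.8435, 0.8139), 2))
```

Applied to the precision and recall reported in the abstract, `f_measure` gives a value that agrees with the reported F-measure to within rounding.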

Comments

Published version. Published as part of Proceedings of the Second Biocreative Challenge Evaluation Workshop 2007. © 2005 CNIO Centro Nacional de Investigaciones Oncológicas.
