Larry Smith, National Center for Biotechnology Information
Lorraine K. Tanabe, National Center for Biotechnology Information
Rie Johnson née Ando, IBM TJ Watson Research
Cheng-Ju Kuo, National Yang-Ming University
I-Fang Chung, National Yang-Ming University
Chun-Nan Hsu, Academia Sinica
Yu-Shi Lin, Academia Sinica
Roman Klinger, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI)
Christoph M. Friedrich, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI)
Kuzman Ganchev, University of Pennsylvania
Manabu Torii, Georgetown University Medical Center
Hongfang Liu, Georgetown University Medical Center
Barry Haddow, University of Edinburgh
Craig A. Struble, Marquette University
Richard J. Povinelli, Marquette University
Andreas Vlachos, University of Cambridge
William A. Baumgartner, University of Colorado School of Medicine
Lawrence Hunter, University of Colorado School of Medicine
Bob Carpenter, Alias-i, Inc.
Richard Tzong-Han Tsai, Academia Sinica
Hong-Jie Dai, Academia Sinica
Feng Liu, Vrije Universiteit Brussel
Yifei Chen, Vrije Universiteit Brussel
Chengjie Sun, Harbin Institute of Technology
Sophia Katrenko, University of Amsterdam
Pieter Adriaans, University of Amsterdam
Christian Blaschke, Tres Cantos (Madrid)
Rafael Torres, Tres Cantos (Madrid)
Mariana Neves, Universidad Complutense de Madrid
Preslav Nakov, University of California - Berkeley
Anna Divoli, University of California - Berkeley
Manuel Maña-López, Universidad de Huelva
Jacinto Mata, Universidad de Huelva
W John Wilbur, National Center for Biotechnology Information

Publisher

BioMed Central

Source Publication

Genome Biology

Original Item ID

doi: 10.1186/gb-2008-9-s2-s2; PubMed Central: PMCID 2559986


Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task, participants designed systems to identify substrings in sentences corresponding to gene name mentions. A variety of methods were used, and the results varied, with a highest achieved F1 score of 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that, by combining the results from all submissions, an F1 score of 0.9066 is feasible, and furthermore that the best combined result makes use of even the lowest scoring submissions.
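
For readers unfamiliar with the metric, the F1 score quoted above is the harmonic mean of precision and recall. The following is a minimal sketch, assuming exact-match scoring over sets of predicted and gold-standard spans; the official task evaluation may differ (for example, in how alternative mention boundaries are credited). The function name `f1_score`, the span tuples, and the sample data are hypothetical and serve only as illustration.

```python
def f1_score(gold, predicted):
    """Return (precision, recall, F1) for exact-match span sets.

    Assumption: a predicted span counts as correct only if it is
    identical to a gold span. This is an illustrative simplification.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical spans: (sentence_id, start_offset, end_offset)
gold = {("s1", 0, 4), ("s1", 10, 15), ("s2", 3, 8)}
predicted = {("s1", 0, 4), ("s2", 3, 8), ("s2", 20, 25), ("s3", 1, 2)}

precision, recall, f1 = f1_score(gold, predicted)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

In this toy case two of the four predictions match two of the three gold spans, so precision is 1/2, recall is 2/3, and F1 is 4/7.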


Published version. Genome Biology, Vol. 9, Suppl. 2 (2008). DOI: 10.1186/gb-2008-9-s2-s2. © 2008 Smith et al.; licensee BioMed Central Ltd.

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.