DNA RESEARCH 20, 273–286, (2013) doi:10.1093/dnares/dst009
Advance Access publication on 9 April 2013

Re-Annotation of Protein-Coding Genes in 10 Complete Genomes of Neisseriaceae Family by Combining Similarity-Based and Composition-Based Methods

Feng-Biao Guo1,2, Lifeng Xiong1, Jade L. L. Teng1, Kwok-Yung Yuen1,3,4, Susanna K. P. Lau1,3,4,*, and Patrick C. Y. Woo1,3,4,*

Department of Microbiology, The University of Hong Kong, Special Administrative Region, Hong Kong, People’s Republic of China1; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, People’s Republic of China2; State Key Laboratory of Emerging Infectious Diseases, The University of Hong Kong, Special Administrative Region, Hong Kong, People’s Republic of China3 and Research Centre of Infection and Immunology, The University of Hong Kong, Special Administrative Region, Hong Kong, People’s Republic of China4

*To whom correspondence should be addressed. Tel. 852-22554897. Fax. 852-28551241. E-mail:
[email protected] (P.C.Y.W.) or
[email protected] (S.K.P.L.)

Edited by Prof. Kenta Nakai
(Received 21 September 2012; accepted 8 March 2013)

Abstract

In this paper, we performed a comprehensive re-annotation of protein-coding genes in 10 complete bacterial genomes of the family Neisseriaceae, using a systematic method that combines composition- and similarity-based approaches. First, 418 hypothetical genes were predicted as non-coding using the composition-based method, and 413 of them were eliminated from the gene list. Both the scatter plot and cluster of orthologous groups (COG) fraction analyses supported the result.