The identification of the complete set of functions of an organism provides the foundation on which our understanding of its biology rests. In essence, it is the basic framework that every genome project targets and from which all biological interpretation originates. However, while the quality and quantity of sequencing data have increased dramatically over the last few years, interpreting those data remains a major bottleneck. As more and more microbes are sequenced, the scientific community's efforts to assign functions to genes are lagging behind. In addition, current public databases contain low-quality data and suffer from a high rate of error propagation, while most contemporary genome databases have largely overlooked the importance of comparative analysis and extensive sequence integration for comprehensive genome analysis and for reconstructing functional cellular subsystems (e.g. metabolic pathways, information processes, etc.). To speed up annotation, group members are developing software tools for determining microbial gene function.
A primary concern of the Genome Biology Program is the quality of the functional annotation and comparative analysis of microbial genomes and communities. To address the problems described above, we are developing functional classification schemes for microbial organisms within the IMG system framework. This work includes the design of (a) controlled vocabularies for representing gene function (IMG Terms) and (b) collections of metabolic and non-metabolic pathways (IMG Pathways and IMG Parts Lists). These developments make it possible to generate general functional overviews from sequenced genomes.
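As an illustration only, the relationship between a controlled vocabulary, pathway and parts-list collections, and a per-genome functional overview could be sketched as follows. This is a minimal Python sketch, not the IMG schema: the class names, fields, identifiers such as `T1`, the toy data, and the simple coverage fraction are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Term:
    """One controlled-vocabulary entry describing a gene function (hypothetical layout)."""
    term_id: str
    name: str


@dataclass
class Pathway:
    """A pathway modeled as an ordered list of steps, each naming one term."""
    name: str
    step_term_ids: list[str]


@dataclass
class PartsList:
    """An unordered collection of terms that together make up one cellular component."""
    name: str
    term_ids: set[str] = field(default_factory=set)


def coverage(required_ids, genome_term_ids):
    """Return the fraction of required terms present in a genome's annotations."""
    required = set(required_ids)
    if not required:
        return 0.0
    return len(required & set(genome_term_ids)) / len(required)


def functional_overview(genome_term_ids, pathways, parts_lists):
    """Map each pathway or parts-list name to its coverage in one genome."""
    overview = {p.name: coverage(p.step_term_ids, genome_term_ids) for p in pathways}
    overview.update(
        {pl.name: coverage(pl.term_ids, genome_term_ids) for pl in parts_lists}
    )
    return overview


# Toy data, invented for illustration; none of it comes from IMG.
vocabulary = [Term(f"T{i}", f"example function {i}") for i in range(1, 6)]
toy_pathway = Pathway("toy pathway", ["T1", "T2", "T3"])
toy_parts = PartsList("toy parts list", {"T4", "T5"})

# Terms assigned to the genes of one hypothetical genome.
genome_terms = {"T1", "T2", "T4"}

overview = functional_overview(genome_terms, [toy_pathway], [toy_parts])
for name, fraction in overview.items():
    print(f"{name}: {fraction:.2f}")
```

Annotating genes with shared term identifiers rather than free-text product names is what makes a comparison like this possible across genomes: two genes count as performing the same function only when they carry the same term.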
More information about our curation efforts is available in the IMG system under IMG Terms, IMG Pathways and IMG Parts Lists.
For the current version of the curated protein functions and pathways, you can visit the:
(a) IMG Term page in IMG
(b) IMG Pathways page in IMG
(c) IMG Parts Lists page in IMG