Genome properties

The genome consists of a 2,147,060 bp chromosome and four large circular plasmids of 315,518 bp, 195,800 bp, 132,270 bp, and 97,188 bp, with a G+C content of 65.6% (Table 3 and Figure 3). Of the 2,799 genes predicted, 2,741 were protein-coding genes and 58 were RNAs; 85 pseudogenes were also identified. The majority of the protein-coding genes (65.0%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 4.

Table 3 Genome Statistics

Figure 3 Graphical circular map of the chromosome (plasmids not shown, but accessible through the IMG/ER pages on the JGI web pages [48]). From outside to the center: genes on forward strand (colored by COG categories), genes on reverse strand (colored by COG categories), ...
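The replicon sizes and gene counts quoted in the text can be cross-checked with a short calculation. The sketch below is only an illustrative tally, not part of the original annotation pipeline; the dictionary keys (`plasmid_1` to `plasmid_4`) are hypothetical labels, while the numbers are taken directly from the text.

```python
# Replicon lengths in bp, as stated in the text.
# The plasmid labels are hypothetical; the source does not name them here.
replicons = {
    "chromosome": 2_147_060,
    "plasmid_1": 315_518,
    "plasmid_2": 195_800,
    "plasmid_3": 132_270,
    "plasmid_4": 97_188,
}

# Gene counts as stated in the text.
total_genes = 2_799
protein_coding = 2_741
rna_genes = 58

# Total genome size is the sum of all replicons.
genome_size = sum(replicons.values())

# Share of the genome carried on plasmids rather than the chromosome.
plasmid_bp = genome_size - replicons["chromosome"]
plasmid_percent = 100 * plasmid_bp / genome_size

# Share of predicted genes that are protein-coding.
protein_coding_percent = 100 * protein_coding / total_genes

print(f"Genome size: {genome_size:,} bp")
print(f"Plasmid share: {plasmid_percent:.1f}%")
print(f"Protein-coding genes: {protein_coding_percent:.1f}%")
```

The protein-coding and RNA gene counts add up to the 2,799 predicted genes, which is a quick consistency check on the figures quoted above.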

Table 4 Number of genes associated with the general COG functional categories

Acknowledgements

We would like to gratefully acknowledge the help of Katja Steenblock (DSMZ) for growing D. proteolyticus cultures. This work was performed under the auspices of the US Department of Energy Office of Science, Biological and Environmental Research Program, and by the University of California, Lawrence Berkeley National Laboratory under contract No. DE-AC02-05CH11231, Lawrence Livermore National Laboratory under contract No. DE-AC52-07NA27344, and Los Alamos National Laboratory under contract No. DE-AC02-06NA25396, UT-Battelle and Oak Ridge National Laboratory under contract DE-AC05-00OR22725, as well as the German Research Foundation (DFG) INST 599/1-2.

The phylogenetic relationship of the 16S rRNA gene of C. clariflavum DSM 19732 with other cellulolytic clostridia from Cluster III is shown in Figure 1. The sequences shown here represent mostly cellulolytic and xylanolytic clostridia sharing over 84.5% sequence identity. The branch comprising C. clariflavum, C. straminisolvens and C. thermocellum is of particular interest, since it includes cellulolytic organisms that share at least 96.6% sequence homology and are able to grow at thermophilic temperatures. A few environmental samples have provided sequences with close homology (>99.0% sequence similarity) to the C. clariflavum 16S rRNA gene; these have been found in thermophilic methanogenic bioreactors [7], enrichment cultures from bioreactors (accession numbers AB231801 and AM408567), and enrichments from thermophilic compost [3]. Two pure cultures with >99.7% sequence similarity to C. clariflavum and able to utilize xylan have been isolated from compost enrichments [4]. However, no evidence of this organism has been reported in metagenomic studies from similar environments.
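The identity and similarity thresholds quoted above (84.5%, 96.6%, >99.0%) are pairwise percentages between aligned 16S rRNA gene sequences. As a minimal sketch of how such a value can be computed, the function below counts matching positions in two pre-aligned sequences. It assumes that columns containing a gap in either sequence are excluded from the comparison; other conventions exist, and the source does not state which one was used. The function name `percent_identity` and the toy sequences are hypothetical and are not real 16S data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: alignment columns with a gap ('-') in either
    sequence are skipped rather than counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example with made-up sequences: one mismatch in ten ungapped columns.
toy_a = "ACGTACGTAC"
toy_b = "ACGTTCGTAC"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```

In practice the sequences would first be aligned with a dedicated tool, and the choice of how to treat gaps and ambiguous bases can shift the reported percentage slightly.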

