The Parasite Genomics Group plan to publish the annotated sequence in a peer-reviewed journal in the near future. The E. tenella genome database was explored to identify genes that were automatically predicted to code for aspartic, cysteine, metallo and serine proteases. Database mining revealed over 60 gene sequences whose predicted open reading frames were associated with potential peptidase activity. Manual annotation of the genes was performed by BLAST search of apicomplexan genome databases to identify phylogenetically closely related nucleotide sequences and by BLAST search of various protein databases to identify the most closely related, experimentally characterized homologs available. Additionally, the predicted proteins were analysed for conserved motifs and domains to further validate protein function.
Each predicted protein was then assigned a level of confidence for function using a five-tiered Evidence Rating (ER) system. The evidence rating system, described previously, allocates genes an overall score, indicating how compelling the bioinformatic and experimental evidence is for protein function. An ER1 rating signifies extremely reliable experimental data to support protein function in the particular species being investigated, in this case Eimeria, whereas ER5 indicates no experimental or bioinformatic evidence for gene function. Genes with an ER5 rating were eliminated from further investigation.
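The elimination step described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' pipeline: the `Candidate` record, the `keep_for_analysis` function and the gene identifiers are hypothetical, and the only rule taken from the text is that ratings run from ER1 to ER5 and that ER5 genes are dropped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A putative protease gene with its Evidence Rating (hypothetical record)."""
    gene_id: str  # placeholder identifier, not a real database accession
    er: int       # Evidence Rating: 1 (strongest evidence) to 5 (no evidence)


def keep_for_analysis(candidates):
    """Return the candidates that survive the ER filter (everything except ER5)."""
    for c in candidates:
        if not 1 <= c.er <= 5:
            raise ValueError(f"{c.gene_id}: ER must be between 1 and 5, got {c.er}")
    return [c for c in candidates if c.er != 5]


# Made-up example data, for illustration only.
candidates = [
    Candidate("gene_A", 2),
    Candidate("gene_B", 5),
    Candidate("gene_C", 4),
]
retained = keep_for_analysis(candidates)
print([c.gene_id for c in retained])
```

In this toy input, only `gene_B` carries an ER5 rating, so it is the only candidate removed.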
After this validation process was performed, 45 putative protease genes remained, and these could be classified into clans and families of aspartic, cysteine, metallo and serine proteases: three aspartic proteases, all within family A1 in clan AA; 16 cysteine proteases, the vast majority of which were in clan CA (five cathepsins, one calpain, eight ubiquitinyl hydrolases and one OTU protease), as well as a single clan CF pyroglutamyl peptidase; 14 metallo proteases, distributed over five clans, ME, MF, MK and MM, and seven families, M41, M48, M16, M17, M22 and M50; and 12 serine proteases in clan PA, clan SB, clan SC, clan SK and clan ST. Three additional rhomboid proteases were identified in the E. tenella genome database by using BLASTP with, as queries, homologs described in T. gondii: rhomboid protease 3, rhomboid protease 4 and rhomboid protease 5.
However, we were unable to confirm coding sequences or stage-specific expression for any of these three genes.
Stage-specific protease gene expression
To assess the stage-specific gene expression of putative proteases identified in the E. tenella database, different stages of the parasite lifecycle were isolated and total RNA was purified. These stages included merozoites, 134 h gametocytes, unsporulated oocysts and sporulated oocysts, as well as uninfected caeca control tissue. RT-PCR was performed and the stage-specific cDNA samples were subjected to control PCRs to determine purity.