Document Type


Date of Original Version



Biological Sciences


To build a foundation for the complete genome analysis of Bombyx mori, we have constructed an EST database. Because gene expression patterns deeply depend on tissues as well as developmental stages, we analyzed many cDNA libraries prepared from various tissues and different developmental stages to cover the entire set of Bombyx genes. So far, the Bombyx EST database contains 35,000 ESTs from 36 cDNA libraries, which are grouped into ≈11,000 nonredundant ESTs with the average length of 1.25 kb. The comparison with FlyBase suggests that the present EST database, SilkBase, covers >55% of all genes of Bombyx. The fraction of library-specific ESTs in each cDNA library indicates that we have not yet reached saturation, showing the validity of our strategy for constructing an EST database to cover all genes. To tackle the coming saturation problem, we have checked two methods, subtraction and normalization, to increase coverage and decrease the number of housekeeping genes, resulting in a 5–11% increase of library-specific ESTs. The identification of a number of genes and comprehensive cloning of gene families have already emerged from the SilkBase search. Direct links of SilkBase with FlyBase and WormBase provide ready identification of candidate Lepidoptera-specific genes.