
In-Silico Analysis of Proteins

Celebrating the 20th anniversary of Swiss-Prot

July 30 - August 04, 2006 : Fortaleza, Brazil

Poster #RP128

Alternative promoters related with protein diversity.

Riu Yamashita*, Katsuki Tsuritani*, Yutaka Suzuki**, Sumio Sugano**, Kenta Nakai*

*Human Genome Center: Univ. of Tokyo, Tokyo, Japan; **Department of Medical Genome Sciences, Graduate School of Frontier Sciences, Univ. of Tokyo, Chiba, Japan

We have constructed the DataBase of Transcription Start Sites (DBTSS) (http://dbtss.hgc.jp) to define precise transcription start sites (TSSs). DBTSS holds accurate TSS information based on experimentally determined 5'-end clones. It currently contains 1,359,000 clones corresponding to 15,262 human genes, as well as 364,487 clones corresponding to 14,162 mouse genes. It also contains 32,263 zebrafish, 10,236 malaria, and 22,923 schyzon 5'-end clones. We observed an average of 2.0 alternative promoters (APs) per human gene. This suggests that APs are one of the factors responsible for protein diversity.
To validate the quality of our data, we compared them with the EPD data. EPD contains 1,871 human promoters collected from the literature. Of these, we could map 1,767 promoter sequences to the human genome, and 1,639 of the mapped promoters (92.8%) lay within 100 bases of a DBTSS TSS. However, EPD also includes data from a full-length cDNA project that uses the same experimental technique as DBTSS. We therefore removed these data and compared the remaining data with DBTSS alternative promoters. The reduced EPD dataset contains 138 alternative promoters, 108 (80.6%) of which correspond to those of DBTSS.
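The 100-base comparison described above can be sketched as follows. This is only an illustrative sketch, not the pipeline used for the poster: the function names, the (chromosome, strand, position) representation, and all coordinates are assumptions made up for the example.

```python
from bisect import bisect_left


def within_window(query, sorted_refs, window=100):
    """Return True if any reference position lies within `window` bases of `query`.

    `sorted_refs` must be sorted in ascending order.
    """
    # First reference position that is >= query - window.
    i = bisect_left(sorted_refs, query - window)
    return i < len(sorted_refs) and sorted_refs[i] <= query + window


def fraction_supported(promoters, tss_by_key, window=100):
    """Count promoters that have a TSS within `window` bases.

    promoters:  iterable of (chrom, strand, pos) tuples (hypothetical layout)
    tss_by_key: dict mapping (chrom, strand) to a sorted list of TSS positions
    """
    promoters = list(promoters)
    hits = sum(
        1
        for chrom, strand, pos in promoters
        if within_window(pos, tss_by_key.get((chrom, strand), []), window)
    )
    fraction = hits / len(promoters) if promoters else 0.0
    return hits, len(promoters), fraction


# Toy, made-up coordinates standing in for DBTSS TSSs and mapped EPD promoters.
dbtss = {("chr1", "+"): [1000, 5000], ("chr2", "-"): [300]}
epd = [("chr1", "+", 1040), ("chr1", "+", 5200), ("chr2", "-", 250), ("chr3", "+", 10)]

hits, total, frac = fraction_supported(epd, dbtss)
print(f"{hits}/{total} promoters within 100 bases ({frac:.1%})")
```

Keeping the TSS positions sorted per chromosome and strand lets each promoter be checked with a binary search instead of a scan over all TSSs, which matters at the scale of the clone counts quoted above.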