Humboldt-Universität zu Berlin - Collaborative Research Center for Theoretical Biology

Correlating regulatory DNA sequences and gene expression data by comparative analysis of non-coding sequences of man and mouse

The availability of complete genomes for several organisms has opened up new possibilities for studying gene regulatory mechanisms, in particular cis-regulatory elements. The gene regulation group focuses on delineating regulatory motifs and interactions by integrating a variety of information sources. In yeast, where extensive protein-protein interaction data have been generated, this information can aid the identification of regulatory modules. In mammals, comparing the non-coding upstream sequences of orthologous genes can pinpoint regions that are likely to have a regulatory role. This approach can be extended by comparing sequences to binding site descriptors that have been collected in publicly available databases. Microarray-generated gene expression data may further help to understand regulatory interactions between genes.

Comparative sequence analysis of two or more genomes is an appropriate tool for investigating gene structure and the surrounding functional elements in the vast sequence space of non-coding DNA. This assumption is supported by the observation that experimentally verified transcription factor binding sites map to highly conserved regions in man-mouse sequence comparisons. An initial large-scale in silico study of sequence conservation upstream of the translational start site demonstrated the power of the comparative approach [1]. Our principal repository for annotated conserved non-coding blocks (CNBs) in homologous upstream regions of man and mouse is CORG, the database for Comparative Regulatory Genomics [2]. CORG contains a precomputed set of CNBs for the upstream regions of more than 12,000 orthologous gene groups. Functional annotation of a CNB often explains why its sequence is conserved. We distinguish untranslated exons from other conserved regions by screening all CNBs against pre-assembled EST clusters. An important part of our research concerns the reliable annotation of transcription factor binding sites within CNBs.
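To make the notion of a conserved block concrete, the following minimal sketch marks highly similar stretches in an already aligned pair of sequences using a sliding identity window. It is an illustration only and not the procedure used to build CORG; the function name conserved_windows, the window length and the identity threshold are assumptions chosen for the example.

```python
def conserved_windows(aln_a, aln_b, window=20, min_identity=0.8):
    """Return merged (start, end) column intervals of a pairwise alignment
    in which a sliding window reaches at least `min_identity` identity.

    `aln_a` and `aln_b` are the two rows of one alignment (same length,
    '-' for gaps). Window size and threshold are illustrative defaults.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")

    # 1 where both rows carry the same nucleotide, 0 for mismatches and gaps.
    matches = [
        int(a == b and a != "-")
        for a, b in zip(aln_a.upper(), aln_b.upper())
    ]
    n = len(matches)
    if n < window:
        return []

    blocks = []
    count = sum(matches[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window by one column.
            count += matches[start + window - 1] - matches[start - 1]
        if count / window >= min_identity:
            end = start + window
            if blocks and start <= blocks[-1][1]:
                # Overlapping or touching window: extend the last block.
                blocks[-1] = (blocks[-1][0], end)
            else:
                blocks.append((start, end))
    return blocks


if __name__ == "__main__":
    # Toy alignment rows, invented for the example.
    human = "ACGTACGTAC" + "AAAAAAAAAA"
    mouse = "ACGTACGTAC" + "CCCCCCCCCC"
    print(conserved_windows(human, mouse, window=5, min_identity=1.0))
```

The intervals are given in alignment columns, so they would still have to be mapped back to genomic coordinates before any annotation step.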

A subsequent step is to associate evolutionarily conserved predicted binding sites with complementary biological data such as time-course microarray data. In an ongoing collaboration, suggested downstream genes of the transcription factor SRF have been scanned for evolutionarily conserved SREs (SRF binding sites). These hypothetical direct target genes are currently under investigation in the laboratory of A. Nordheim (Tübingen). A further study addressed another much-studied biological process: the response of dendritic cells to lipopolysaccharide (LPS), a component of the cell wall of Gram-negative bacteria [3]. Analysing the upstream regions of genes that appear to be co-regulated in the respective microarray experiment allows us to identify the endpoints of Toll-receptor signalling, which is involved in this pathway. Likewise, regulation of the cell cycle in human (HeLa) cells has been studied. Some transcription factor binding sites, such as those of the E2F family, are strongly enriched in the upstream regions of genes that fall into particular cell cycle phases. We have now initiated a cooperation with the research group of Constance Scharff at the MPI to assess the impact of selected transcription factors on cell cycle progression using RNA-interference technology.
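One common way to quantify whether a binding site is enriched in a group of co-regulated genes is a one-sided hypergeometric test. The sketch below illustrates that general idea; it is not necessarily the statistic used in the studies described above, and the function name and all numbers in the usage example are invented for illustration.

```python
from math import comb


def hypergeom_enrichment_p(n_genes, n_with_site, cluster_size, cluster_with_site):
    """Probability of seeing at least `cluster_with_site` site-carrying genes
    when `cluster_size` genes are drawn at random, without replacement, from
    `n_genes` genes of which `n_with_site` carry the predicted site.
    """
    upper = min(n_with_site, cluster_size)
    total = comb(n_genes, cluster_size)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(n_with_site, k) * comb(n_genes - n_with_site, cluster_size - k)
        for k in range(cluster_with_site, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 5000 genes on the array, 400 with a conserved
    # predicted site, and a cluster of 120 genes of which 25 carry the site.
    p = hypergeom_enrichment_p(5000, 400, 120, 25)
    print(f"enrichment p-value: {p:.3g}")
```

When many binding site descriptors and many gene groups are tested, the resulting p-values would need a multiple-testing correction before interpretation.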

[1] Annotating regulatory DNA based on man-mouse genomic comparison. Dieterich C, Wang H, Rateitschak K, Krause A, Vingron M. Bioinformatics. 2002 Oct;18 Suppl 2:S84-90.

[2] CORG: a database for Comparative Regulatory Genomics. Dieterich C, Wang H, Rateitschak K, Luz H, Vingron M. Nucleic Acids Res. 2003 Jan 1;31(1):55-7.

[3] Exploring potential target genes of signalling pathways by predicted conserved transcription factor binding sites. Dieterich C, Herwig R, Vingron M. Bioinformatics (to appear).
