SELEX_DB is an online resource containing both experimental data on in vitro selected DNA/RNA oligomers (aptamers) and applets for recognizing these oligomers. In vitro selection of oligomers that bind target proteins is a novel technology, intensively developed during the last decade, that sieves a pool of synthetic oligomers through repeated cycles of PCR amplification and protein-binding selection (1). To support human genome annotation, we have developed the SELEX_DB database on oligomers selected in vitro, supplemented with Web-available applets for site recognition (2). Moreover, many diseases may be caused not only by a mutation that alters a true transcription factor binding site on DNA, but also by the appearance of a novel protein-fitting noise site that alters the normal regulation of a gene network (e.g., the substitution -376G>A in the human TNF gene promoter produces a noise site fitting the transcription factor OCT, causing the clinical phenotype 'severe malaria' (3)); in vitro selected aptamers are therefore very informative for Single Nucleotide Polymorphism (SNP) analysis. At the same time, in prokaryotes, discrepancies between the positional nucleotide frequency matrices of in vitro selected and natural sites have been comprehensively demonstrated (4). In addition, the positional information content matrices of in vitro selected aptamers were found to correlate with protein-binding strength, whereas no such correlation was found for the corresponding natural sites (5). This means that, in prokaryotes, natural sites were selected in vivo for their biological activity rather than for protein affinity. In eukaryotes, the relationship between in vivo and in vitro selection seems to be very complex. On the one hand, in vitro selected TBP-binding DNAs provide natural TATA-box activity (6).
Moreover, the homologous c-Myb and v-Myb proteins, the minimal Myb DNA-binding domain, and Myb-fortified cell nuclear extract all select in vitro aptamers that are significantly similar to one another and to the natural c-Myb sites (7). On the other hand, in vitro selected YY1-binding DNAs, inserted into plasmids and transfected into various cells ('plasmid+cell' systems), repress the reporter gene (8), supporting the observation that YY1 binding strength and repression magnitude do not correlate. Moreover, for these in vitro selected YY1-fitting aptamers, the YY1-caused repression magnitudes measured in vivo in one 'plasmid+cell' system do not correlate with the corresponding magnitudes measured in vivo in another 'plasmid+cell' system. Given this evident system dependence of both in vitro and in vivo experimental data, a fundamental question today is how in vitro selection data can be applied to the analysis of natural genes (1). That is why the new release of SELEX_DB is supplemented with two databases, SYSTEM (9, 10) and CROSS_TEST, which store the experimental systems and their cross-validation tests. By cross-validation testing, we have unexpectedly observed that, for a fixed protein-binding site, the recognition accuracy increases with increasing homology between the target and test proteins. For natural sites, the recognition accuracy was lower than for the closest protein homologs and higher than for distant homologs and for non-homologous proteins binding the same site. The current SELEX_DB release is available at http://wwwmgs.bionet.nsc.ru/mgs/systems/selex/.
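As an illustration of the positional frequency and information content measures discussed above, the following Python sketch computes both from a set of aligned, equal-length DNA sequences. It is only a minimal sketch and not the code of the SELEX_DB applets: the function names, the toy sequences, and the choice of a uniform background (so that a fully conserved position carries 2 bits) are assumptions made here for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: DNA aptamers, unambiguous bases only


def frequency_matrix(seqs):
    """Return one {base: frequency} dict per alignment position."""
    if not seqs:
        raise ValueError("at least one sequence is required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned and of equal length")
    matrix = []
    for j in range(length):
        counts = Counter(s[j].upper() for s in seqs)
        total = sum(counts[b] for b in ALPHABET)
        if total == 0:
            raise ValueError(f"no A/C/G/T symbols at position {j}")
        matrix.append({b: counts[b] / total for b in ALPHABET})
    return matrix


def information_content(matrix):
    """Per-position information content in bits.

    Assumes a uniform background, i.e. IC = 2 - Shannon entropy.
    """
    ic = []
    for column in matrix:
        entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
        ic.append(2.0 - entropy)
    return ic


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from SELEX_DB.
    aptamers = ["TATAAA", "TATATA", "TATAAA", "TATATA"]
    freqs = frequency_matrix(aptamers)
    for position, bits in enumerate(information_content(freqs)):
        print(position, round(bits, 3))
```

Comparing such per-position values between an in vitro selected set and the corresponding natural sites is one simple way to make the discrepancies described above visible; small sample corrections and non-uniform backgrounds are deliberately left out of this sketch.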