HGT-DB, Horizontal Gene Transfer-DataBase

The HGT-DB is a genomic database that includes statistical parameters such as G+C content, codon usage and amino-acid usage, as well as information about which genes deviate in these parameters, for complete prokaryotic genomes. Under the hypothesis that genes from distantly related species have different nucleotide compositions, these deviant genes may have been acquired by horizontal gene transfer (HGT). The methods used to decide whether a gene is extraneous in terms of G+C content or codon usage, and therefore a candidate to have been acquired by HGT, are described in Garcia-Vallve et al. 2000. The HGT-DB is organized by genome, i.e. every prokaryotic genome that has been completely sequenced forms a new entry. Different chromosomes from the same organism, and genomes from different strains of the same species, are found in separate entries. The current version of the database contains 88 genomes, which are sorted alphabetically and classified taxonomically. For each genome, the database provides statistical parameters for all the genes, as well as averages and standard deviations of G+C content, codon usage, relative synonymous codon usage and amino-acid content. It also provides information about correspondence analyses of the codon usage, lists of extraneous groups of genes in terms of G+C content, lists of putatively acquired genes, and a tab-delimited file with all the statistical calculations for each gene of a genome. The fields available for each gene in these files are: its position (coordinates, strand and length), gene name, function, the Cluster of Orthologous Groups (COG) it belongs to, total and positional G+C content, the Mahalanobis distance to the average codon usage, amino-acid content deviations, if any, and a prediction of whether the gene belongs to a region with a high or low G+C content or whether it has been acquired by HGT.
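The total and positional G+C content listed among the per-gene fields can be illustrated with a short sketch. This is not the database's own code: the function names, the input format (a dictionary of in-frame coding sequences) and the `threshold` of 1.5 standard deviations are assumptions chosen for illustration, and the mean is taken over the supplied genes rather than over a whole genome. The criteria actually used by the HGT-DB are those described in Garcia-Vallve et al. 2000.

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Fraction of G or C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return gc / total if total else 0.0


def positional_gc(cds):
    """Return (total, GC1, GC2, GC3) for an in-frame coding sequence."""
    cds = cds.upper()
    cds = cds[: len(cds) - len(cds) % 3]  # ignore a trailing partial codon
    # cds[i::3] collects the bases at codon position i+1
    return (gc_fraction(cds),) + tuple(gc_fraction(cds[i::3]) for i in range(3))


def flag_gc_outliers(genes, threshold=1.5):
    """Names of genes whose total G+C deviates from the mean of all supplied
    genes by more than `threshold` standard deviations.

    `genes` maps a gene name to its coding sequence. The default threshold
    is illustrative only (an assumption, not a value taken from the database).
    """
    totals = {name: gc_fraction(seq) for name, seq in genes.items()}
    mu = mean(totals.values())
    sigma = pstdev(totals.values())
    if sigma == 0:
        return []  # all genes identical in G+C: nothing deviates
    return sorted(n for n, v in totals.items() if abs(v - mu) > threshold * sigma)
```

For example, `positional_gc("ATGGCC")` gives the total G+C followed by the G+C at the first, second and third codon positions, and `flag_gc_outliers` returns the names of the genes that would be candidates for closer inspection under the assumed threshold.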
The information in the database can also be accessed via a search engine that allows searches for gene names or keywords within a specific organism. When searching for a gene name, one can also view the upstream and downstream genes. With this information, researchers can examine the G+C content and codon usage of a gene when they find incongruences in sequence-based phylogenetic trees. HGT-DB is freely accessible at http://www.fut.es/~debb/HGT.
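When examining the codon usage of a single gene, the relative synonymous codon usage (RSCU) mentioned above is a convenient summary. The sketch below uses the standard definition of RSCU (observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally) and assumes the standard genetic code. The names `CODON_TABLE` and `rscu` are hypothetical, and the code is an illustration rather than the database's implementation.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the usual TCAG ordering of codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds):
    """RSCU value of each sense codon in an in-frame coding sequence.

    Amino acids that do not occur in the sequence are omitted from the
    result; codons containing ambiguous bases are ignored.
    """
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group synonymous codons by amino acid, skipping stop codons.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    result = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        expected = total / len(family)  # count under equal synonymous usage
        for c in family:
            result[c] = counts[c] / expected
    return result
```

A value above 1 means a codon is used more often than expected under equal usage of its synonyms, and a value below 1 means it is used less often. Comparing a gene's values with the genome averages gives a first impression of whether its codon usage is atypical.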



genomics prokaryotic genome
