Found 8 tags

rna sequence (304); protein sequence (209); nucleic acid sequence (185); sequence analysis (104); dna sequences (97); sequence assembly (49); sequence sites, features and motifs (35); sequence annotation (31)

Found 345 sources
Source Match ReputationScore*

Exome Aggregation Consortium Browser


The Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a variety of large-scale sequencing projects, and to make summary data available for the wider scientific community. ...
1
38%

Genebass


Genebass is a resource of exome-based association statistics, made available to the public. The dataset encompasses 3,817 phenotypes with gene-based and single-variant testing across 281,852 individuals with exome sequence data from the UK Biobank.
2
26%

ESP


NHLBI Exome Sequencing Project (ESP): Exome Variant Server (EVS) for browsing single nucleotide variation data from exome sequencing experiments mainly focused on heart, lung and blood disorders.
3
27%

Wellcome Sanger Institute: Whole Exome Sequencing


There is a substantial overlap between the NIHR IBD BioResource and the IBD UK Genetics Consortium (IBDGC). The NIHR BioResource provides some DNA samples. IBDGC data is being provided by the Wellcome Sanger Institute, who are performing the sequenci ...
4
23%

dbMTS


dbMTS is a comprehensive database of putative human microRNA target site (MTS) SNVs and their functional predictions. dbMTS collects all potential SNVs in microRNA target seed regions in human 3’UTRs and provides their functional predictions and annotat ...
5
25%

Practical Haplotype Graph


Platform for storing and using pangenomes for imputation.
6
22%

Sequence Ontology


SO is a collaborative ontology project for the definition of sequence features used in biological sequence annotation. The Sequence Ontology is a set of terms and relationships used to describe the features and attributes of biological sequence. SO i ...
7
49%

SomaMutDB


A database of somatic mutations in normal human tissues.
8
23%

CanVaS


CanVaS is a Greek cancer patient genetic variation resource.
9
22%

Gene4Denovo


An integrated database and analytic platform for de novo mutations in humans. De novo mutations (DNMs) significantly contribute to sporadic diseases, particularly in neuropsychiatric disorders. Whole-exome sequencing (WES) and whole-genome sequencin ...
10
26%

KRGDB


The large-scale variant database of 1722 Koreans based on whole genome sequencing.
11
26%

DDBJ Sequence Read Archive


DDBJ Sequence Read Archive (DRA) is an archive database for output data generated by next-generation sequencing machines including Roche 454 GS System®, Illumina Genome Analyzer®, Applied Biosystems SOLiD® System, and others. DRA is a member of the I ...
12
23%

DDBJ Trace Archive


DDBJ Trace Archive (DTA) is a permanent repository of DNA sequence chromatograms (traces), base calls, and quality estimates for single-pass reads from various large-scale sequencing projects. DTA is a member of the International Nucleotide Sequence ...
13
22%

COGVIC


COGVIC (Catalogue Of Germline Variants In Cancer) is a comprehensive database of germline pathogenic variants in East Asian pan-cancer patients.
14
22%

CNVIntegrate


Multi-ethnic database for identifying copy number variations associated with cancer. View gene-centric CNV profile collected from healthy individuals and multiple cancer types.
15
22%

Sequence Read Archive


The Sequence Read Archive (SRA) stores raw sequencing data from the next generation of sequencing platforms. Data submitted to SRA is organized using a metadata model consisting of six objects: study, sample, experiment, run, analysis and submissi ...
16
66%

TMC-SNPdb 2.0


An ethnic-specific database of Indian germline variants.
17
22%

GenBank Sequence Format


GenBank Sequence Format (GenBank Flat File Format) consists of an annotation section and a sequence section. The start of the annotation section is marked by a line beginning with the word "LOCUS". The start of sequence section is marked by a line be ...
18
24%

AstraZeneca PheWAS Portal


The AstraZeneca PheWAS Portal is a public repository of gene-phenotype associations for phenotypes derived from electronic health records, questionnaire data, and continuous traits. These data were generated using exome sequencing and phenotype data ...
19
26%

Uniclust


Clustered protein sequences and multiple sequence alignments
20
22%

gnomAD


Genome Aggregation Database (gnomAD) - browser that aggregates exome and whole-genome sequencing data from a wide variety of large-scale sequencing projects. It enables search of genetic variation information by gene, variant or region.
21
53%

Sequence Alignment Map


The Sequence Alignment/Map (SAM) format is a TAB-delimited text format consisting of a header section, which is optional, and an alignment section.
22
68%
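
The two-section layout described in the Sequence Alignment Map entry can be illustrated with a minimal sketch. It assumes that header lines begin with "@", a detail taken from the SAM specification rather than from the description above; the function name and the example record are hypothetical.

```python
def split_sam(text):
    """Split SAM text into header lines and TAB-separated alignment records."""
    header, alignments = [], []
    for line in text.splitlines():
        if not line:
            continue  # ignore blank lines
        if line.startswith("@"):
            # Assumption: header lines start with "@" (per the SAM specification).
            header.append(line)
        else:
            # Alignment lines are TAB-delimited fields.
            alignments.append(line.split("\t"))
    return header, alignments


# Hypothetical example: two header lines and one alignment record.
example = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:ref\tLN:45\n"
    "r001\t99\tref\t7\t30\t8M\t=\t37\t39\tTTAGATAA\t*\n"
)
header, alignments = split_sam(example)
print(len(header), alignments[0][0])
```

Because the header section is optional, the same function also handles input with no "@" lines and simply returns an empty header list.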

MPS6


Review and classification of published variants in the ARSB gene. The purpose of this database is to support researchers and clinicians in understanding structural changes in arylsulfatase B (ASB) caused by Mucopolysaccharidosis type VI (MPS6) mutations ...
23
27%

Database of Sequence Tagged Sites


dbSTS is an NCBI resource that contains sequence data for short genomic landmark sequences or Sequence Tagged Sites.
24
38%

Genome Variation Format


The Genome Variation Format (GVF) is a very simple file format for describing sequence alteration features at nucleotide resolution relative to a reference genome.
25
24%

UK Biobank


UK Biobank is a large-scale biomedical database and research resource that provides researchers access to detailed longitudinal phenotype, medical and genetic data from 500,000 volunteer participants.
26
33%

SIMAP


Protein sequences are of utmost importance for studying the function and evolution of genes and genomes. Therefore a rich collection of methods in computational biology relies on the analysis and comparison of protein sequences. Many of these intensi ...
27
27%

Human Genetic Variation Database


The Human Genetic Variation Database (HGVD) aims to provide a central resource to archive and display Japanese genetic variation and association between the variation and transcription level of genes. The database currently contains genetic variation ...
28
35%

FASTQ Sequence and Sequence Quality Format


FASTQ is a text-based file format for sharing sequencing data combining both the sequence and an associated per base quality score.
29
44%
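
The pairing of sequence and per-base quality described in the FASTQ entry can be sketched as follows. This is a minimal illustration under two assumptions not stated in the description above: each record occupies exactly four lines, and quality characters use the Phred+33 encoding. The function name and the example read are hypothetical.

```python
def read_fastq(text):
    """Parse four-line FASTQ records into (name, sequence, quality scores)."""
    lines = [ln for ln in text.splitlines() if ln]
    records = []
    for i in range(0, len(lines), 4):
        name, seq, _plus, qual = lines[i:i + 4]
        # Assumption: Phred+33 encoding, so score = ASCII code - 33.
        scores = [ord(c) - 33 for c in qual]
        records.append((name[1:], seq, scores))  # drop the leading "@"
    return records


# Hypothetical example: one read of five bases.
fastq_example = "@read1\nGATTA\n+\nII5!#\n"
records = read_fastq(fastq_example)
print(records)
```

Each base in the sequence lines up with one quality character, which is why the sequence string and the score list have the same length.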

DBSAV database


DBSAV database reports GTS scores of human genes and DeepSAV scores of SAVs in the human proteome, including pathogenic SAVs, benign SAVs, gnomAD SAVs observed in exome sequencing, and all possible SAVs by single nucleotide variations. Each human pro ...
30
23%

dbNSFP


Database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) and splice-site variants (ssSNVs) in the human genome. It also facilitates the steps of filtering and prioritizing SNVs fr ...
31
36%

GenBank


GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. The complete release notes for the current version of GenBank are available on the NCBI ftp site. A new release is made every two months. G ...
32
72%

CMPD


CMPD is designed to provide a comprehensive, integrated and well-annotated resource focusing on protein sequence-altering variations originating from both germline and cancer-associated somatic variations. The mutated protein sequence pool was base ...
33
22%

Feature Annotation Location Description Ontology


The Feature Annotation Location Description Ontology (FALDO) describes the positions of annotated features on linear and circular sequences for data resources represented in RDF and/or OWL. FALDO can be used to describe nucleotide features in sequ ...
34
28%

INSD sequence record XML


The International Nucleotide Sequence Database Collaboration (INSDC) is a long-standing foundational initiative that operates between DDBJ, EMBL-EBI and NCBI. INSDC covers the spectrum of data from raw reads, through alignments and assemblies, to functional ...
35
26%

ATAV


ATAV is a comprehensive platform for population-scale genomic analyses. ATAV stores variant and per site coverage data for all samples in a centralized database, which is efficiently queried by ATAV to support diagnostic analyses for trios and single ...
36
22%

PhenomeCentral


Repository for clinicians and scientists working in the rare disorder community. It enables secure sharing of case records by clinicians and rare disease scientists and helps the user to find additional cases of the same unnamed disorder. The reposit ...
37
29%

EBI patent sequences


Non-redundant databases of patent DNA and protein sequences
38
26%

openSNP


A crowdsourced collection of personal genomics data. Includes SNP genotyping, exome sequencing data, phenotypic annotation and quantified self tracking data.
39
29%

Minimum Information about any (x) Sequence


The minimum information about any (x) sequence (MIxS) is an overarching framework of sequence metadata, that includes technology-specific checklists from the previous MIGS and MIMS standards, provides a way of introducing additional checklists such a ...
40
39%

NCBI Trace Archives


The Trace Archives includes the following archives: The Sequence Read Archive (SRA) stores raw sequence data from "next-generation" sequencing technologies including 454, IonTorrent, Illumina, SOLiD, Helicos and Complete Genomics. In addition to raw ...
41
24%

Reference Sequence Annotation


An ontology for sequence annotations and how to preserve them with reference sequences.
42
26%

UniParc


The UniProt archive (UniParc), part of the UniProt databases, is an archival protein sequence collection from all major publicly accessible resources. New and revised protein sequences are added daily into UniParc while not deleting the previous vers ...
43
25%

ENA Sequence Flat File Format


ENA Sequence Flat File Format is a standardised plain text format for nucleotide sequences. This format was previously called the EMBL Sequence Flat File Format.
44
24%

CGGA


The Chinese Glioma Genome Atlas (CGGA) is a user-friendly web application for data storage and analysis to explore brain tumors datasets. This database includes the whole-exome sequencing, DNA methylation, mRNA sequencing, mRNA microarray and microRN ...
45
22%

UniRef


The UniProt Reference Clusters are three separate datasets that compress sequence space at different resolutions, achieved by merging sequences and sub-sequences that are 100% (UniRef100), >=90% (UniRef90), or >=50% (UniRef50) identical, regardless o ...
46
43%

DNA Data Bank of Japan


An annotated collection of all publicly available nucleotide and protein sequences. DDBJ collects sequence data mainly from Japanese researchers, as well as researchers in other countries. DDBJ is part of the International Nucleotide Sequence Databas ...
47
40%

UCSC Genome Browser database


Genome assemblies and aligned annotations for a wide range of vertebrates and model organisms, along with an integrated tool set for visualizing, comparing, analyzing and sharing both publicly available and user-generated genomic datasets.
48
88%

Berkeley Drosophila Genome Project EST database


The goals of the Drosophila Genome Center are to finish the sequence of the euchromatic genome of Drosophila melanogaster to high quality and to generate and maintain biological annotations of this sequence.
49
23%

Pfam


The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Pfam also generates higher-level groupings of related entries, known as clans. A clan is a collection of Pf ...
50
76%

FASTA Sequence Format


FASTA format is a text-based format for representing either nucleotide sequences or peptide sequences, in which nucleotides or amino acids are represented using single-letter codes. The format also allows for sequence names and comments to precede th ...
51
53%

European Nucleotide Archive


The European Nucleotide Archive (ENA) is a globally comprehensive data resource for nucleotide sequence, spanning raw data, alignments and assemblies, functional and taxonomic annotation and rich contextual data relating to sequenced samples and expe ...
52
52%

Mitochondrial Disease Sequence Data Resource


The Mitochondrial Disease Sequence Data Resource (MSeqDR) is a centralized genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phen ...
53
32%

NCBI Third Party Annotation


TPA is a database that contains sequences built from the existing primary sequence data in GenBank. TPA records are retrieved through the Nucleotide Database and feature information on the sequence, how it was cataloged, and proper way to cite the se ...
54
22%

SYSTERS


The integration of SYSTERS, GeneNest and SpliceNest into one framework facilitates the over-all exploration of the whole sequence space covering protein, mRNA and EST sequences, as well as genomic DNA. The SYSTERS protein sequence cluster set provide ...
55
22%

Dfam


The Dfam database is an open collection of DNA Transposable Element sequence alignments, hidden Markov Models (HMMs), consensus sequences, and genome annotations. Dfam represents a collection of multiple sequence alignments, each containing a set of r ...
56
39%

cis-Regulatory Element Database


The cisRED database holds conserved sequence motifs identified by genome scale motif discovery, similarity, clustering, co-occurrence and coexpression calculations. Sequence inputs include low-coverage genome sequence data and ENCODE data.
57
34%

Binary sequence information Format


A .2bit file stores multiple DNA sequences (up to 4 Gb total) in a compact randomly-accessible format. The file contains masking information as well as the DNA itself. The DNA sequence is represented as two bits per base with an associated list of regi ...
58
22%

SuperSite


Dictionary of binding sites in proteins
59
22%

PANDIT


PANDIT is a collection of multiple sequence alignments and phylogenetic trees covering many common protein domains. It contains the seed protein sequence alignments from the Pfam-A (curated families) database; nucleotide sequence alignments derived f ...
60
23%

mESAdb


microRNA Expression and Sequence Analysis Database
61
28%

Genome Warehouse


The Genome Warehouse (GWH) is a public archival resource housing genome-scale data for a wide range of species. GWH accepts a variety of data types, including whole genome, chloroplast, mitochondrion and plasmid. For each collected genome assembly, G ...
62
36%

SEVENS


Seven-transmembrane-helix receptors (7-TMR), known as G-protein-coupled receptors [1], are important genes that work as the gateway of signal transduction induced by ligand binding. Recent progress in determination of human draft sequences [2,3] acce ...
63
22%

Expressed Sequence Tags database


The dbEST contains sequence data and other information on "single-pass" cDNA sequences, or "Expressed Sequence Tags", from a number of organisms. NCBI is in the process of merging EST and GSS records into the Nucleotide database, and the process is e ...
64
42%

GISSD


Group I Intron Sequence and Structure Database
65
22%

YeTFaSCo


Yeast Transcription Factor binding Site sequence Collection
66
22%

cpnDB


Chaperonins are a diverse family of molecular chaperones present in the plastids, mitochondria, and cytoplasm of eukaryotes, and in bacteria and archaea. The family is divided into group I (CPN60, also known as Hsp60 or GroEL, found in bacteria, some ...
67
31%

ENA Sequence XML Schema


ENA Sequence XML Schema is a standardised XML schema for nucleotide sequences. All assembled and annotated sequences must conform to this schema.
68
24%

CoPS


Comprehensive peptide signature database
69
22%

O-GLYCBASE


O-GLYCBASE is a database of glycoproteins with O-linked and C-linked glycosylation sites. Entries with at least one experimentally verified glycosylation site have been compiled from protein sequence databases and literature. Each entry contains info ...
70
22%

OryGenesDB: an interactive tool for rice reverse genetics


The aim of this Oryza sativa database was first to display sequence information such as the T-DNA and Ds flanking sequence tags (FSTs) produced in the framework of the French genomics initiative Genoplante and the EU consortium Cereal Gene Tags. This ...
71
27%

NRichD


Efficiency of protein remote homology detection methods depends on the dispersion of the protein sequence space and the availability of intermediate sequences between two related protein families. In the absence of any structural evidence and natural ...
72
22%

PRF


Protein research foundation database of peptides: sequences, literature and unnatural amino acids
73
22%

Peptaibol


The Peptaibol Database is a sequence and structure resource for the unusual class of peptides known as peptaibols. The database includes sequence, biological source, and bibliographical data for the naturally-occurring peptaibols. Information is also ...
74
22%

Progenetix - genomic copy number aberrations in cancer


The Progenetix database provides an overview of copy number abnormalities in human cancer from Comparative Genomic Hybridization (CGH) experiments. With 30817 cases from 1016 publications (Oct 2013), Progenetix is the largest curated database for who ...
75
40%

resiDB


ResiDB is a user-friendly sequence similarity-dependent database manager for bacteria, fungi, viruses, protozoa, invertebrate, plants, archaea, environmental and whole genome shotgun sequence data. Create a new database Access existing databases Loa ...
76
23%

Genome Sequence Archive


GSA is a data repository specialized for archiving raw sequence reads. It supports data generated from a variety of sequencing platforms ranging from Sanger sequencing machines to single-cell sequencing machines and provides data storing and sharing ...
77
39%

ProTeus


Signature sequences at the protein N- and C-termini
78
22%

The UCSC Archaeal Genome Browser


The UCSC Archaeal Genome Browser is a window on the biology of more than 100 microbial species from the domain Archaea. Basic gene annotation is derived from NCBI GenBank/RefSeq entries, with overlays of sequence conservation across multiple species, ...
79
56%

HMMER Profile File Format


The profile hidden Markov Model (HMM) calculated from multiple sequence alignment data in this service is stored in Profile HMM save format (usually with ".hmm" extension). It is an ASCII file containing a lot of header and descriptive records follow ...
80
36%

Major Intrinsic Proteins Modification Database


This is a database of comparative protein structure models of the MIP (Major Intrinsic Protein) family of proteins. The MIPs have been identified from the completed genome sequence of organisms available at NCBI.
81
35%

Ocean Gene Atlas


The Ocean Gene Atlas service provides data mining access to three complementary data objects: gene sequence catalogs (ENA), sample environmental context (PANGAEA), and gene abundances estimates in samples (computed by mapping sequence reads onto gene ...
82
28%

Proteomics Standards Initiative Extended Fasta Format


The PSI Extended Fasta Format (PEFF) is a unified format for protein and nucleotide sequence databases to be used by sequence search engines and other associated tools (spectra library search tools, sequence alignment software, data repositories, etc ...
83
28%

DTU Bioinformatics


CBS offers comprehensive public databases of DNA and protein sequences, macromolecular structure, gene and protein expression levels, pathway organization and cell signalling, which have been established to optimise scientific exploitation of the explosi ...
84
22%

DriverDBv2


DriverDB is a database that incorporates >9500 cancer-related RNA-seq datasets and >7000 more exome-seq datasets, in addition to annotation databases and published bioinformatics algorithms dedicated to driver gene/mutation identification. Seven additi ...
85
30%

Universal PBM Resource for Oligonucleotide Binding Evaluation


The UniPROBE (Universal PBM Resource for Oligonucleotide Binding Evaluation) database hosts data generated by universal protein binding microarray (PBM) technology on the in vitro DNA binding specificities of proteins.
86
39%

miRBase


The miRBase database is a searchable database of published miRNA sequences and annotation. Each entry in miRBase represents a predicted hairpin portion of a miRNA transcript (termed mir in the database), with information on the location and sequence ...
87
73%

DDBJ/ENA/GenBank Feature Table


The GenBank, EMBL, and DDBJ nucleic acid sequence data banks have from their inception used tables of sites and features to describe the roles and locations of higher order sequence domains and elements within the genome of an organism. In February, ...
88
31%

CATH


The CATH database of protein domain structures (http://www.biochem.ucl.ac.uk/bsm/cath_new) currently contains 34,287 domain structures classified into 1,383 superfamilies and 3,285 sequence families. Each structural family is expanded with domain seq ...
89
22%

SCOPe


The ASTRAL compendium provides a set of tools and databases designed to aid investigators in the analysis of protein structure, particularly through the use of sequence comparison. Astral augments SCOP, a manual classification of protein domains acco ...
90
35%

PolyQ


Polyglutamine Repeats in Proteins
91
28%

ConoServer


ConoServer is a database specializing in sequences and structures of peptides expressed by marine cone snails. The database gives access to protein sequences, nucleic acid sequences and structural information on conopeptides. ConoServer's data are fi ...
92
46%

Minimum Information about a MARKer gene Sequence


MIMARKS is the metadata reporting standard of the Genomic Standards Consortium that covers marker gene sequences from environmental surveys or individual organisms
93
42%

NCBI Viral Genomes Resource


NCBI Viral Genomes Resource is a collection of virus genomic sequences that provides curated sequence data, related information and tools. It includes all complete viral genome sequences deposited in the International Nucleotide Sequence Database Col ...
94
38%

Nucleotide Sequence Database Collaboration


This database is a joint effort to collect and disseminate databases containing DNA and RNA sequences. It is a long-standing foundational initiative that operates between DDBJ, EMBL-EBI and NCBI. It covers the spectrum of data from raw reads, th ...
95
27%

Rfam


The Rfam database is a collection of RNA families, each represented by multiple sequence alignments, consensus secondary structures and covariance models (CMs). The families in Rfam break down into three broad functional classes: non-coding RNA genes ...
96
45%

Protein Clusters


Related protein sequences (clusters) of Reference Sequence proteins encoded by complete genomes
97
22%

PDBSite


3D structure of protein functional sites
98
22%

Insertion Sequence Finder


This database provides a list of insertion sequences (IS) isolated from bacteria and archaea. It is organized into individual files containing their general features (name, size, origin, family, ...) as well as their DNA and potential protein sequence ...
99
46%

Enzyme Structure Function Ontology


The ESFO provides a new paradigm for organizing enzyme sequence, structure, and function information, whereby specific elements of enzyme sequence and structure are mapped to specific conserved aspects of function, thus facilitating the functional an ...
100
28%

PA-GOSUB


Protein sequences from model organisms, GO assignment and subcellular localization
101
22%

Reference Sequence Database


The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins.
102
60%

Conserved Domain Database


The Conserved Domain Database (CDD) brings together several collections of multiple sequence alignments representing conserved domains, including NCBI-curated domains, which use 3D-structure information to explicitly define domain boundaries and p ...
103
64%

PAZAR


PAZAR is a software framework for the construction and maintenance of regulatory sequence data annotations; a framework which allows multiple boutique databases to function independently within a larger system (or information mall). The goal of PAZAR ...
104
34%

UniSave


The UniProtKB Sequence/Annotation Version database (UniSave) is a comprehensive archive of UniProtKB/Swiss-Prot and UniProtKB/TrEMBL entry versions. All changed Swiss-Prot and TrEMBL entries are loaded into the UniSave as part of the public UniProtK ...
105
23%

Minimotif Miner


Search tools for short functional motifs involved in posttranslational modifications, binding to other proteins, nucleic acids, or small molecules
106
22%

Lipase Engineering Database


The Lipase Engineering Database (http://www.led.uni-stuttgart.de) integrates information on sequence, structure, and function of lipases, esterases, and related proteins. Sequence data on 806 protein entries are assigned to 38 homologous families, wh ...
107
27%

FireDB


fireDB is a database of Protein Data Bank structures, ligands and annotated functional site residues. The database can be accessed by PDB codes or UniProt accession numbers as well as keywords.
108
30%

RNAcentral


RNAcentral is a free, public resource that offers integrated access to a comprehensive and up-to-date set of non-coding RNA sequences provided by a collaborating group of databases representing a broad range of organisms and RNA types.
109
36%

Locus Reference Genomic sequences


Each LRG is a stable genomic DNA sequence for a region of the human genome.
110
22%

Deep Sequence and Shape Motif (DESSO)


DESSO is a deep learning-based framework that can be used to accurately identify both sequence and shape regulatory motifs from the human genome.
111
25%

NucleaRDB


Families of nuclear hormone receptors
112
32%

al MENA


The Middle East and North Africa (MENA) region encompasses unique populations with a rich history and characteristic ethnic, linguistic and genetic diversity. The genetic diversity of the MENA region has been largely unknown. The recent availability ...
113
22%

Gramene: A curated, open-source, integrated data resource for comparative functional genomics in plants


Gramene's purpose is to provide added value to plant genomics data sets available within the public sector, which will facilitate researchers' ability to understand the plant genomes and take advantage of genomic sequence known in one species for ide ...
114
52%

Visual Database for Organelle Genome


VDOG (Visual Database for Organelle Genome) is a database of genome information for organelles. Most of the data in VDOG are originally extracted from GenBank, then re-organized and re-presented.
115
22%

Minimal Metagenome Sequence Analysis Standard


A proposed set of minimal standard analyses necessary for proper interpretation of meta-omic data and to allow comparative metagenomics and metatranscriptomics. Please note: We cannot find an up-to-date website for this resource. As such, we have mar ...
116
32%

Sequence-Structural Templates of Single-member Superfamilies


SSToSS is a database which provides sequence-structural templates of single member protein domain superfamilies like PASS2. Sequence-structural templates are recognized by considering the content and overlap of sequence similarity and structural para ...
117
28%

NCBI Trace Archive


The NCBI Trace Archive is a permanent repository of DNA sequence chromatograms (traces), base calls, and quality estimates for single-pass reads from various large-scale sequencing projects. The Trace Archive serves as the repository of sequencing da ...
118
22%

.ACE format


The ACE file format is a specification for storing data about genomic contigs. The original ACE format was developed for use with Consed, a program for viewing, editing, and finishing DNA sequence assemblies. ACE files are generated by various assemb ...
119
22%

EcoliWiki: A Wiki-based community resource for Escherichia coli


EcoliWiki is a community-based resource for the annotation of all non-pathogenic E. coli, its phages, plasmids, and mobile genetic elements.
120
27%

CompoDynamics


Sequence composition dynamics of genes and genomes.
121
23%

MulPSSM


Representation of multiple sequence alignments of protein families in terms of Position Specific Scoring Matrices (PSSMs) is commonly used in the detection of remote homologues. A PSSM is generated with respect to one of the sequences involved in the ...
122
22%

siRNAdb


The siRNA database provides a gene-centric view of human siRNA experimental data, including siRNAs of known efficacy and siRNAs predicted to be of high efficacy by siSearch. Linked to these sequences is information including siRNA thermodynamic prope ...
123
28%

Hits


High throughput genome (HTG) and expressed sequence tag (EST) sequences are currently the most abundant nucleotide sequence classes in the public database. The large volume, high degree of fragmentation and lack of gene structure annotations prevent ...
124
22%

Stanford HIV Drug Resistance Database


The Stanford HIV Drug Resistance Database (HIVDB) is an essential resource for public health officials monitoring ADR and TDR, for scientists developing new ARV drugs, and for HIV care providers managing patients with HIVDR.
125
43%

PASS2


PASS2 contains alignments of structural motifs of protein superfamilies. PASS2 is an automatic version of the original superfamily alignment database, CAMPASS (CAMbridge database of Protein Alignments organised as Structural Superfamilies). PASS2 con ...
126
38%

The Chromosome 7 Annotation Project


The objective of this project is to generate the most comprehensive description of human chromosome 7 to facilitate biological discovery, disease gene research and medical genetic applications.
127
34%

Bio-Mirror


A world bioinformatic public service for high-speed access to up-to-date DNA & protein biological sequence databanks.
128
25%

SEQanswers


Wiki on all aspects of next-generation genomics
129
27%

TESS


TESS (Transcription Element Search System, http://www.cbil.upenn.edu/tess) is a web-based service that searches DNA sequence for transcription factor binding sites. It integrates three databases of transcription factors and binding site models, and p ...
130
22%

PHOSIDA


Phosphorylation sites in various species identified by mass spectrometry
131
34%

Regulatory Element Database for Drosophila


REDfly is a curated collection of known Drosophila transcriptional cis-regulatory modules (CRMs) and transcription factor binding sites (TFBSs). REDfly seeks to include all experimentally verified fly regulatory elements along with their DNA sequence ...
132
36%

BMC Caller


A webtool to identify and analyze bacterial microcompartment types in sequence data.
133
22%

ASC - Active Sequence Collection


ASC (Active Sequences Collection) is a database of short amino acid sequences with known biological activity. The current version is substantially improved as compared to the previous release; it now includes more than 1300 different active short pro ...
134
22%

PIR - Protein Information Resource


The Protein Information Resource (PIR) is an integrated public bioinformatics resource that supports genomic and proteomic research and scientific studies. PIR has provided many protein databases and analysis tools to the scientific community, includ ...
135
23%

Minimal Information about any Sequence Ontology


An OWL representation of the Minimum Information for any (x) Standard (MIxS), managed by the Genomic Standards Consortium.
136
22%

Protein kinase resource


The Protein Kinase Resource (PKR) is a curated information source which provides an integrated view of sequence and structure data combined with biochemical and genetic function data focused on a single family of proteins, the protein kinases. In add ...
137
22%

PIR SuperFamily


The PIR SuperFamily concept is being used as a guiding principle to provide comprehensive and non-overlapping clustering of UniProtKB sequences into a hierarchical order to reflect their evolutionary relationships.
138
34%

RNArchitecture


RNArchitecture is a database that provides a comprehensive description of relationships between known families of structured ncRNAs, with focus on sequence and structure similarities. RNArchitecture also provides literature information and links to o ...
139
29%

Ensembl Zebrafish Genome Browser


This Ensembl website features the zebrafish whole genome shotgun assembly sequence.
140
38%

NCBI Virus


NCBI Virus is a community portal for viral sequence data from RefSeq, GenBank and other NCBI repositories.
141
42%

fRNAdb


Functional RNA Database (fRNAdb) is a database service that hosts a large collection of non-coding transcripts including annotated/un-annotated sequences from H-inv database, NONCODE, and RNAdb. A set of computational sequence analyses are performed ...
142
22%

ParameciumDB


ParameciumDB is a new model organism database for Paramecium, built using components of the Generic Model Organism Database (http://www.gmod.org) construction set (Chado relational database schema, Turnkey generic web framework and Gbrowse). The data ...
143
30%

Chicken Variation Database


The chicken Variation Database (ChickVD) is an integrated information system for storage, retrieval, visualization and analysis of chicken variation data.
144
28%

PSSRdb


Polymorphic Simple Sequence Repeats Database
145
26%

PGDBj Ortholog Database


The PGDBj Ortholog Database, created under the auspices of the Plant Genome Database Japan (PGDBj), contains information about orthologous genes in plants based on their corresponding amino acid sequence similarity. By placing PGDBj Ortholog Database ...
146
26%

GPCR-SSFE


GPCR-Sequence-Structure-Feature-Extractor (SSFE). Provides template suggestions and homology models of Class A GPCRs. Identifies key sequence and structural motifs in Class A GPCRs to guide template selection and build homology models.
147
22%

Genomic Contextual Data Markup Language


The Genomic Contextual Data Markup Language (GCDML) is a core project of the Genomic Standards Consortium (GSC) that is a reference implementation of the Minimum Information about a Genome Sequence (MIGS/MIMS/MIMARKS), and the extensions the Minimum Inf ...
148
33%

Organelle Genome Resource


The organelle genomes are part of the NCBI Reference Sequence (RefSeq) project that provides curated sequence data and related information for the community to use as a standard.
149
22%

Distributed Sequence Annotation System


The Distributed Annotation System (DAS) defines a communication protocol used to exchange annotations on genomic or protein sequences.
150
32%

NCBI Gene


The Entrez Global Query Cross-Database Search System is a federated search engine, or web portal that allows users to search many discrete health sciences databases at the National Center for Biotechnology Information (NCBI) website. Entrez can effic ...
151
50%

Codon Usage Database


Find GC content and frequency of codon usage for any organism that has a sequence in GenBank.
152
36%

GenomeTraFaC


GenomeTraFaC is a database of conserved regulatory elements obtained by systematically analyzing the orthologous set of human and mouse genes. It mainly focuses on all of the high-quality mRNA entries of mouse and human genes in the Reference Sequenc ...
153
28%

Integrated resource of protein families, domains and functional sites


InterPro is a resource that provides functional analysis of protein sequences by classifying them into families and predicting the presence of domains and important sites. To classify proteins in this way, InterPro uses predictive models, known as si ...
154
67%

PolyA_DB


155
22%

msRepDB


A comprehensive repetitive sequence database of over 80 000 species.
156
22%

ASTD


AltSplice and AltExtron provide information on alternative intron/exons, alternative splice events, and isoform splice patterns. AEdb contains: AEdb-Sequence (sequence and properties of alternatively spliced exons), AEdb-Function (data on functional a ...
157
22%

UniProt Knowledgebase


Universal Protein resource. A database of protein sequence and functional information, many entries being derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the re ...
158
100%

HOGENOM


HOGENOM is a phylogenomic database providing families of homologous genes and associated phylogenetic trees (and sequence alignments) for a wide set of sequenced organisms.
159
34%

SitEx


Projections of protein functional Sites on Exons
160
26%

Ribosomal Database Project (RDP-II)


The Ribosomal Database Project - II (RDP-II)(1) provides data, tools and services related to ribosomal RNA sequences to the research community. Through its website (http://rdp.cme.msu.edu), RDP-II offers aligned and annotated rRNA sequence data, anal ...
161
39%

Ensembl Compara


Ensembl Compara provides cross-species resources and analyses, at both the sequence level and the gene level.
162
38%

Gene3D


Gene3D uses the information in CATH to predict the locations of structural domains on millions of protein sequences available in public databases. Sequence data from UniProtKB and Ensembl for domains with no experimentally determined structures are s ...
163
39%

Alias


A tool for converting identifiers in which multiple aliases are used to refer to sequences. Also available as a stand-alone tool.
164
26%

Pig Genomic Informatics System


The Pig Genomic Informatics System (PigGIS) presents accurate pig gene annotations in all sequenced genomic regions. It integrates various available pig sequence data, including 3.84 million whole-genome shotgun (WGS) reads and 0.7 million Expressed ...
165
26%

Network of Cancer Genes


The Network of Cancer Genes (NCG) contains information on duplicability, evolution, protein-protein and microRNA-gene interaction, function, expression and essentiality of cancer genes from manually curated publications. NCG also provides informatio ...
166
34%

CoxBase


CoxBase is an online platform for epidemiological surveillance, visualization, analysis and typing of Coxiella burnetii genomic sequence.
167
22%

lncRNASNP2


168
22%

NCBI Nucleotide


The NCBI Nucleotide database collects sequences from such sources as GenBank, RefSeq, TPA, and PDB. Sequences collected relate to genome, gene, and transcript sequence data, and provide a foundation for research related to the biomedical field.
169
22%

DARNED


Database of RNA Editing
170
22%

CR-EST - Crop ESTs


The crop EST database CR-EST (http://pgrc.ipk-gatersleben.de/cr-est/) is a publicly available online resource providing access to sequence, classification, clustering, and annotation data of crop EST projects at IPK Gatersleben, Germany. CR-EST curre ...
171
29%

Amordad


Database engine for comparing metagenomic data at massive scale. It first obtains the sequence signature of metagenomes and organizes them as points in high dimensional space.
172
25%

CORG - A database for COmparative Regulatory Genomics


Sequence conservation in non-coding, upstream regions of orthologous genes from man and mouse is likely to reflect common regulatory DNA sites. Motivated by this assumption we have delineated a catalogue of conserved non-coding sequence blocks and pr ...
173
22%

ProTherm


ProThermDB is a database for proteins and mutants with data on protein stability, an increase of 84% from the previous version. It contains several thermodynamic parameters such as melting temperature, free energy obtained with thermal and denaturant ...
174
22%

Rice Genome Annotation Project


This website provides genome sequence from the Nipponbare cultivar of rice and annotation of the 12 rice chromosomes. These data are available through search pages and the Genome Browser that provides an integrated display of annotation data.
175
37%

PREX


PeroxiRedoxin classification indEX
176
30%

VIRsiRNAdb


VIRsiRNAdb contains information on experimentally validated viral siRNA/shRNA that target viral genome regions. It provides efficacy information where available, along with the siRNA sequence, viral target and subtype, and the target genomic ...
177
30%

Hardwood Genomics Project


The Hardwood Genomics Project is a database of expressed genes, genetic markers, genetic linkage maps, and reference populations. It provides lasting genomic and biological resources for the discovery and conservation of genes in hardwood trees for ...
178
29%

APPRIS


Annotates variants with biological data such as protein structural information, functionally important residues, conservation of functional domains and evidence of cross-species conservation.
179
32%

PRODORIC2


180
22%

PROSITE


PROSITE is a database of protein families and domains. PROSITE consists of documentation entries describing protein domains, families and functional sites as well as associated patterns and profiles to identify them.
181
51%

Fungal and Oomycete genomics resource


FungiDB is an integrated genomic and functional genomic database for the kingdom Fungi. The database integrates whole genome sequence and annotation and also includes experimental and environmental isolate sequence data. The database includes compara ...
182
35%

ColabFold


ColabFold databases are MMseqs2 expandable profile databases to generate diverse multiple sequence alignments to predict protein structures.
183
22%

GenomeNet


Network of database and computational resources including KEGG (pathways, interactions, etc.) and DBGET/LinkDB (an integrated database retrieval system). It also hosts several web-based tools for sequence analysis (e.g. BLAST, MOTIF, Clustal W).
184
36%

MEROPS


The MEROPS database is an information resource for peptidases (also termed proteases, proteinases and proteolytic enzymes) and the proteins that inhibit them.
185
53%

ElastoDB


Repository for well-characterized elastin sequences to facilitate its study. The database has since expanded to include other non-elastin sequences that share elastic properties.
186
27%

Berkeley Drosophila Genome Project insitu


In early 2010 we updated the site to facilitate more rapid transfer of our data to the public database and focus our efforts on the core mission of providing expression pattern images to the research community. The original database https://www.fruit ...
187
22%

PseudoBase++


PseudoBase is a database containing structural, functional and sequence data related to RNA pseudoknots. It can be reached by its central page at http://pseudobaseplusplus.utep.edu. From here one can retrieve pseudoknot data as well as submit data fo ...
188
22%

abYsis


abYsis is a web-based antibody research system that includes an integrated database of antibody sequence and structure data. The publicly available version includes pre-analyzed sequence data from the European Molecular Biology Laboratory European Nu ...
189
33%

NCBI PopSet


NCBI PopSet collects DNA sequences to analyze the ways that populations are related by evolution. Such sequences indicate if populations originate from different members of the same species or from organisms of different species entirely.
190
22%

ValidNESs


191
22%

FlyBase


Genetic, genomic and molecular information pertaining to the model organism Drosophila melanogaster and related sequences. This database also contains information relating to human disease models in Drosophila, the use of transgenic constructs contai ...
192
56%

Minimal Information about any Sequence (MIxS) Controlled Vocabularies


Controlled vocabularies for the MIxS family of metadata checklists. See http://gensc.org/gc_wiki/index.php/MIxS for details on the MIxS checklists.
193
22%

mutLBSgeneDB


Mutations in Ligand Binding Sites gene DataBase
194
27%

MAR databases


The MAR databases are a collection of manually curated marine microbial contextual and sequence databases, based at the Marine Metagenomics Portal. This was developed as a part of the ELIXIR EXCELERATE project in 2017 and is maintained by The Center f ...
195
30%

UTRdb/UTRsite


The 5' and 3' untranslated regions of eukaryotic mRNAs may play a crucial role in the regulation of gene expression controlling mRNA localization, stability and translational efficiency. For this reason we developed UTRdb, a specialized database of 5 ...
196
23%

Connectivity Table file format


A CT (Connectivity Table) file contains secondary structure information for an RNA sequence.
197
24%

TIGR Plant Transcript Assembly database


The TIGR Plant Transcript Assemblies (TA) database (http://plantta.tigr.org) uses expressed sequences collected from the NCBI GenBank Nucleotide database for the construction of transcript assemblies. The sequences collected include expressed Sequenc ...
198
22%

PDBselect


PDBselect (http://bioinfo.tg.fh-giessen.de/pdbselect/) is a list of representative protein chains with low mutual sequence identity selected from the Protein Data Bank (PDB) to enable unbiased statistics. The list increased from 155 chains in 1992 to ...
199
30%

Database of small human non-coding RNAs


Integrated annotation and sequencing-based expression data for all major classes of human small non-coding RNAs (sncRNAs) for both full sncRNA transcripts and mature sncRNA products derived from these larger RNAs.
200
32%

Information system for G protein-coupled receptors


The GPCRDB is a molecular-class information system that collects, combines, validates and stores large amounts of heterogeneous data on G protein-coupled receptors (GPCRs). The GPCRDB contains data on sequences, ligand binding constants and mutations. ...
201
44%

ForestTreeDB


ForestTreeDB is intended as a resource that centralizes large-scale EST sequencing results from several tree species (http://foresttree.org/ftdb). Our group at the Center for Computational Genomics and Bioinformatics (University of Minnesota) aims to ...
202
22%

NEMBASE


Nematode sequence and functional data database
203
22%

alkaligrass


A high-quality genome sequence of alkaligrass provides insights into halophyte stress tolerance. A high-quality chromosome-level genome sequence of alkaligrass assembled from Illumina, PacBio and 10× Genomics reads combined with genome-wide chromosom ...
204
25%

eSLDB - eukaryotic Subcellular Localization database


eSLDB (eukaryotic Subcellular Localization DataBase) collects the annotations of subcellular localization of eukaryotic proteomes. For each sequence, the database lists localization obtained adopting three different approaches: 1) experimentally dete ...
205
22%

Saccharomyces Genome Database


The Saccharomyces Genome Database (SGD) provides comprehensive integrated biological information for the budding yeast Saccharomyces cerevisiae along with search and analysis tools to explore these data, enabling the discovery of functional relations ...
206
54%

RADAR


A Rigorously Annotated Database of A-to-I RNA editing
207
22%

RPFdb


Ribosome profiling database
208
22%

Spliceosome Database


209
22%

eF-site - Electrostatic surface of Functional site


Electrostatic potentials and hydrophobic properties of the active sites
210
22%

Colorectal Cancer Atlas


Colorectal Cancer Atlas is a web-based resource that integrates genomic and proteomic data pertaining to colorectal cancer cell lines and tissues. Data catalogued include quantitative and non-quantitative protein expression, sequence variations, cell ...
211
29%

Placental Genetic Variance


Includes variations of DNA sequence, chromosomal structure and copy number, as well as RNA and translational variation. The Genetic Variation ontology expands on work done for Variation Ontology (VariO) and Sequence Types and Features Ontology (SO) w ...
212
22%

iPfam


A database of Pfam domain interactions
213
22%

Interrupted coding sequences


ICDS database is a database containing ICDS detected by a similarity-based approach. The definition of each interrupted gene is provided as well as the ICDS genomic localisation with the surrounding sequence.
214
29%

MitoProteome


MitoProteome is a mitochondrial protein sequence database and annotation system. The initial release contains 847 human mitochondrial protein sequences, derived from public sequence databases and mass spectrometric analysis of highly purified human h ...
215
30%

DescribePROT


DescribePROT is a database containing annotations of 13 putative structural and functional properties at the amino acid level for ~1.4 million proteins from 83 popular/model organisms, to be extended to hundreds of additional organisms. Users can sear ...
216
26%

Cnidarian Evolutionary Genomics Database


CnidBase, the Cnidarian Evolutionary Genomics Database, is a tool for investigating the evolutionary, developmental and ecological factors that affect gene expression and gene function in cnidarians.
217
25%

CRISPRCasdb


CRISPRCasdb acts as a gateway to a publicly accessible database and software to enable the easy detection of CRISPR sequences in locally-produced data and the consultation of CRISPR sequence data present in the database. It also gives information on ...
218
46%

BPS


Database of RNA Base-Pair Structures
219
22%

Multiple Alignment Format


The Multiple Alignment Format stores DNA level multiple alignments in an easily readable format between entire genomes. Unlike previous formats this resource can cope with forward and reverse strand directions, multiple pieces to the alignment, and s ...
220
24%

TrSDB


Transcription factor database
221
22%

Cacao Genome Database


The Cacao Genome Database (CGD) is a database storing information on the genome of Theobroma cacao. The release of the cacao genome sequence provides researchers with access to the latest genomic tools, enabling more efficient research and accelerati ...
222
24%

SoyBase


SoyBase, the USDA-ARS soybean genetic database, is a comprehensive repository for professionally curated genetics, genomics and related data resources for soybean. SoyBase contains genetic, physical and genomic sequence maps integrated with qualitati ...
223
40%

DESSO-DB


A web database for sequence and shape motif analyses and identification.
224
22%

piRNAclusterDB


Clusters of piRNAs
225
22%

NCBI Genome Data Viewer


The NCBI Genome Data Viewer (GDV) is a genome browser supporting the exploration and analysis of annotated eukaryotic genome assemblies. The GDV browser can visualize different types of molecular data in a whole genome context, including gene annotat ...
226
26%

Therapeutic Structural Antibody Database


The Therapeutic Structural Antibody Database tracks all antibody- and nanobody-related therapeutics recognized by the World Health Organisation (WHO), and identifies any corresponding structures in the Structural Antibody Database (SAbDab) with near- ...
227
31%

ARAMEMNON


ARAMEMNON is a curated database for Arabidopsis thaliana transmembrane (TM) proteins and transporters. The database compiles topology and signal sequence predictions and displays the results in a directly comparable graphical output format for presen ...
228
35%

UNITE database


UNITE is a database and sequence management environment centered on the eukaryotic nuclear ribosomal ITS region. All eukaryotic ITS sequences from the International Nucleotide Sequence Database Collaboration are clustered to approximately the species ...
229
22%

Hollywood


Exon annotation database
230
22%

TOPPR


The Online Protein Processing Resource
231
22%

sRNAMap


small regulatory RNA in microbial genomes
232
22%

CloneDB


Clones and libraries: sequence data, map positions and distributor information
233
22%

CLUSTAL-W Alignment Format


CLUSTAL-W Alignment Format is a simple text-based format, often with a *.aln file extension, used for the input and output of DNA or protein sequences into the Clustal suite of multiple alignment programs.
234
72%

LOX-DB


Due to their involvement in several diseases like cancer, inflammation, fever or arthritis, a lot of research is done on lipoxygenases yielding information about sequence, structure and function of these proteins. The LipOXygenases-DataBase (LOX-DB) ...
235
22%

Expansin Engineering Database


Expansin Engineering Database integrates information on sequence, structure and function of expansins.
236
23%

GABI-Kat SimpleSearch


T-DNA insertions in Arabidopsis and their flanking sequence tags.
237
42%

miRNEST


miRNEST is an integrative collection of animal, plant and virus microRNA data. miRNEST is being gradually developed to create an integrative resource of miRNA-associated data. The data comes from our computational predictions (new miRNAs, targets, mi ...
238
34%

INTERVAL


The INTERVAL bioresource comprises 50,000 English blood donors, on whom deep molecular phenotypes (e.g. genomics, proteomics, metabolomics, lipidomics) have been generated. In over 100 years of blood donation practice, INTERVAL is the first randomise ...
239
23%

Membranome


A database of single-pass membrane proteins
240
22%

Pharmacogenomics Ontology


The PharmGKB Ontology imports genetic sequence data, collected in relational format, into the OWL, and aims to automate the process of updating the links between the ontology and data acquisition when the ontology changes. They have linked PharmGKB w ...
241
31%

SILVA


SILVA is a comprehensive, quality-controlled web resource for up-to-date aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains alongside supplementary online services. In addition to data products, SILVA provide ...
242
72%

BAliBASE


BAliBASE is a benchmark alignment database, including enhancements for repeats, transmembrane sequences and circular permutations.
243
35%

EbolaID


Provides a complete, quality checked and regularly updated list of oligonucleotides for the Ebola virus. The database describes the genetic diversity across the Ebola genome to facilitate the design of accurate diagnostic methods and therapeutic appr ...
244
24%

CIS-BP


The Catalog of Inferred Sequence Binding Preferences (CIS-BP) is a library of transcription factor (TF) DNA binding motifs and specificities. The data are organized in a user friendly manner for ease of searching, browsing, and downloading. CIS-BP al ...
245
22%

eProS


Energy profiles of protein structures
246
22%

WDSPdb


WD40 domain structure predictions
247
22%

DoBISCUIT


Database Of BIoSynthesis clusters CUrated and InTegrated
248
22%

Molecular Modeling Database


The Molecular Modeling Database (MMDB), as part of the Entrez system, facilitates access to structure data by connecting them with associated literature, protein and nucleic acid sequences, chemicals, biomolecular interactions, and more.
249
39%

PomBase


PomBase is a model organism database that provides organization of and access to scientific data for the fission yeast Schizosaccharomyces pombe. PomBase supports genomic sequence and features, genome-wide datasets and manual literature curation as w ...
250
44%

DBD


DBD provides transcription factor predictions for more than 150 completely sequenced genomes available for browsing and download. Predictions are based on presence of sequence specific DNA binding domain assignments using hidden Markov models from th ...
251
23%

Genome Reviews


The goal of the Genome Reviews project is to provide an up-to-date, standardised and comprehensively annotated view of the genomic sequence of organisms with completely deciphered genomes. Genome Reviews are curated versions of EMBL/GenBank/DDBJ dat ...
252
23%

REDIportal


A-to-I RNA editing events in human
253
22%

SelenoDB


A database of selenoprotein genes, proteins and SECIS elements
254
22%

SomamiR


Somatic mutations that impact microRNA targeting in cancer
255
22%

DAnCER


Disease-Annotated Chromatin Epigenetics Resource
256
22%

National Omics Data Encyclopedia


The National Omics Data Encyclopedia (NODE) is a big data library with complete and integrative data storage, safe and efficiency-guaranteed data management as well as comprehensive and user-friendly data service functions. NODE stores raw sequence dat ...
257
23%

Bacterial protein tYrosine Kinase database


The Bacterial protein tYrosine Kinase database (BYKdb) contains computer-annotated BY-kinase sequences. The database web interface allows static and dynamic queries and provides integrated analysis tools including sequence annotation.
258
36%

GlycoCT sequence format for carbohydrates.


GlycoCT format is devised to describe the carbohydrate sequences, with a controlled vocabulary to name monosaccharides, adopting IUPAC rules to generate a consistent, machine-readable nomenclature, based on a connection table approach, instead of a l ...
259
29%

SINEBase


A database of short interspersed elements (SINEs)
260
22%

ChromDB


Chromatin-associated proteins in a broad range of organisms
261
22%

Database of Rice Transcription Factors


DRTF contains 2025 putative transcription factors (TFs) in Oryza sativa L. ssp. indica and 2384 in ssp. japonica, distributed in 63 families, identified by computational prediction and manual curation. It includes detailed annotations of each TF incl ...
262
31%

Factorbook


Human transcription factor binding data from ChIP-seq
263
22%

Annotated regulatory Binding Sites from Orthologous Promoters


ABS: A database of Annotated regulatory Binding Sites from known binding sites identified in promoters of orthologous vertebrate genes.
264
30%

Ebola and Hemorrhagic Fever Virus Database


The Ebola and Hemorrhagic Fever Virus Database stems from the Hemorrhagic Fever Viruses (HFV) Database Project founded by Dr. Carla Kuiken in 2009 at the Los Alamos National Laboratory (LANL). The HFV Database was modeled on the Los Alamos HIV Databa ...
265
29%

POSTAR


Post-transcriptional regulation by RNA-binding proteins
266
22%

UniGene


This repository is no longer available. Although the web pages are no longer available, you will still be able to download the final UniGene builds as static content from the FTP site https://ftp.ncbi.nlm.nih.gov/repository/UniGen ...
267
23%

YM500


smRNA-seq database for miRNA research
268
22%

RAID


Human RNA-RNA and RNA-protein interactions
269
22%

tRNAdb


Compilation of tRNA sequences and tRNA genes
270
22%

COMBREX


Computational Bridge to Experiments
271
29%

L1Base


Functional annotation and prediction of LINE-1 elements
272
22%

ARED-Plus


273
22%

Candida Genome Database


The Candida Genome Database (CGD) provides access to genomic sequence data and manually curated functional information about genes and proteins of the human pathogen Candida albicans. It collects gene names and aliases, and assigns gene ontology term ...
274
38%

EchinoDB


EchinoDB is a database consisting of amino acid sequence orthoclusters from 42 echinoderm transcriptomes. We sampled taxa to span the deepest divergences within each of the 5 extant echinoderm classes. Data can be searched by keywords such as annotati ...
275
22%

IMGT/LIGM-DB


IMGT/LIGM-DB is the IMGT® comprehensive database of immunoglobulin (IG) and T cell receptor (TR) nucleotide sequences, from human and other vertebrate species, with translation for fully annotated sequences, created in 1989 by LIGM (http://www.imgt.o ...
276
38%

Databases of Orthologous Promoters


DoOP is a database of eukaryotic promoter sequences (upstream regions), aiming to facilitate the recognition of regulatory sites conserved between species. Based on the Arabidopsis thaliana and Homo sapiens genome annotation, this resource is also a ...
277
28%

LenVarDB


Database of length variation in protein domains
278
22%

Short Read Archive eXtensible Markup Language


The SRA data model contains the following objects. Study: information about the sequencing project. Sample: information about the sequenced samples. Experiment: information about the libraries, platform; associated with study, sample(s) and run(s). Run: ...
279
30%

UUCD


Ubiquitin and ubiquitin-like conjugation database
280
22%

ECgene


Genome annotation for alternative splicing
281
22%

AniProtDB


The Animal Proteome Database (AniProtDB) is a comprehensive collection of proteomes from 100 species spanning 21 animal phyla. In addition to providing open access to this collection of high-quality metazoan proteomes, information on predicted protei ...
282
22%

PLPMDB


Pyridoxal-5'-phosphate dependent enzymes mutations
283
22%

eBLOCKS


Classifying proteins into families and super-families allows identification of functionally important conserved domains. The motifs and scoring matrices derived from such conserved regions provide computational tools to recognize similar patterns in n ...
284
22%

miRNAMap


microRNA precursors and their mapping to targets in vertebrate genomes
285
22%

MAPPER-2


This resource provides information primarily on the upstream non-coding sequence data of genes in 3 genomes, which gives insight into transcription factor binding sites (TFBSs). For each transcript, the region scanned extends from 10,000bp upstre ...
286
34%

Database resources of the National Center for Biotechnology Information


The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts publish ...
287
54%

RetrOryza


With the availability of the complete genomic sequence of rice, the identification and annotation of LTR-Retrotransposons has become a necessity as they comprise an important part of plant genomes (1). RetrOryza is a database that aims at providing t ...
288
22%

PlantAligDB


Web-based platform of nucleotide sequence alignments of plants.
289
22%

TTSMI


Triplex Target DNA Sites in the human genome
290
22%

Binary Alignment Map Format


BAM is the compressed binary version of the Sequence Alignment/Map (SAM) format, a compact and indexable representation of nucleotide sequence alignments. Many next-generation sequencing and analysis tools work with SAM/BAM. For custom track display, ...
291
33%

MeT-DB


RNA MEthylation by SEquencing databaSe
292
22%

Epitome


Epitome is a database of all known antigenic residues and the antibodies that interact with them, including a detailed description of the residues involved in the interaction and their sequence/structure environments. Each entry in the database descr ...
293
22%

3DBIONOTES


Web based application designed to integrate protein structure, protein sequence and protein annotations in a unique graphical environment. The current version of the application offers a unified, enriched and interactive view of EMDB volumes, PDB str ...
294
22%

China National GeneBank DataBase


The China National GeneBank database (CNGBdb) is a unified platform for biological big data sharing and application services. At present, CNGBdb has integrated a large amount of internal and external biological data from resources such as CNGB, NCBI, ...
295
26%

INTEGRALL


INTEGRALL is a web-based platform dedicated to compiling information on integrons and designed to organize all the data available for these genetic structures. INTEGRALL provides a public genetic repository for sequence data and nomenclature and offers ...
296
22%

microRNA.org


microRNA target predictions and expression profiles
297
22%

PPT-DB


Protein Property Prediction and Testing Database
298
22%

ADDA - A Domain Database


ADDA is a global clustering of protein sequences into protein domains and protein domain families. The database currently contains domains for 1.5 million sequences from UniProt, ENSEMBL, and other sequence databases. The domains are grouped into 123,000 ...
299
22%

RNA Ontology


RNAO is a controlled vocabulary pertaining to RNA function and based on RNA sequences, secondary and three-dimensional structures. The central aim of the RNA Ontology Consortium (ROC) is to develop an ontology to capture all aspects of RNA - from pri ...
300
34%

Ribonuclease P Database


RNase P sequences, alignments, and structures
301
22%

Generic Feature Format Version 3


The Generic Feature Format Version 3 (GFF3) format was developed after earlier formats, although widely used, became fragmented into multiple incompatible dialects. The GFF3 format addresses the most common extensions to GFF, while preserving backwar ...
302
33%

TIGRFAMs


TIGRFAMs is a collection of manually curated protein families focusing primarily on prokaryotic sequences. It consists of hidden Markov models (HMMs), multiple sequence alignments, Gene Ontology (GO) terminology, Enzyme Commission (EC) numbers, gene s ...
303
40%

MachiBase


Drosophila melanogaster 5' mRNA transcription start site database
304
22%

DoriC


oriC regions in bacterial and archaeal genomes
305
22%

SNP2TFBS


Regulatory SNPs affecting predicted transcription factor binding sites
306
22%

PALI


The database of Phylogeny and ALIgnment of homologous protein structures (PALI) contains structure-based sequence alignments and dendrograms based on information primarily derived from the structural alignments at domain level [1,2]. Protein domain d ...
307
22%

KIDFamMap


Kinase-inhibitor-disease family map
308
22%

PHYTOPROT


Clusters of predicted plant proteins
309
22%

Ontology for Genetic Interval


Using BFO (Basic Formal Ontology) as its upper-level ontology, the Ontology for Genetic Interval (OGI) represents a gene as an entity with its 3D shape, topography, and primary DNA sequence as the foundation for its 3D structure. There is no official h ...
310
23%

ACTIVITY


ACTIVITY, a database on DNA site sequences with known activity magnitudes, measurement systems and sequence-activity relationships under fixed experimental conditions, is additionally adapted to applications to the phylogenetic footprints of known sit ...
311
22%

MimoDB


Mimotope database, active site-mimicking peptides selected from phage-display libraries
312
30%

NBDB


NBDB database provides profiles of Elementary Functional Loops (EFLs) involved in binding of nucleotide-containing ligands. Each EFL in form of a PSSM (position-specific scoring matrix) profile is complemented with the information on SCOP entities, s ...
313
22%

SilkDB


The SilkDB is an open-access database for genome biology of the silkworm (Bombyx mori). SilkDB contains the genomic data, including genome assembly, gene annotation, chromosomal mapping, orthologous relationship and experiment data, such as microarra ...
314
31%

LNCediting


RNA editing sites in lncRNAs from human, monkey, mouse and fly
315
22%

Kinomer


Classification of protein kinases encoded in various eukaryotic species
316
22%

MegaMotifbase


Structural motifs in protein families and superfamilies
317
22%

Transcription Factor Class


TFClass is a resource that classifies eukaryotic transcription factors (TFs) according to their DNA-binding domains. Combining information from different resources, manually checking the retrieved mammalian TF sequences and applying extensive phyloge ...
318
32%

PyIgClassify


Clusters of conformations of antibody CDRs
319
22%

ZiFDB


Zinc Finger DataBase
320
22%

WERAM


Writers, Erasers and Readers of Histone Acetylation and Methylation
321
22%

NRED


Noncoding RNA Expression Database
322
22%

MALISAM


Manual alignments for structurally analogous motifs in proteins
323
22%

SpliceNest


A tool for visualizing splicing of genes from EST data
324
22%

BeetleBase


Genome database of the beetle Tribolium castaneum
325
33%

Synthetic Gene Database


The Synthetic Gene Database (http://www.evolvingcode.net/codon/sgdb/index.php) is a resource that has collected together sequence information on synthetic genes (i.e. genes that were designed conceptually, rather than built from an initial, physical ...
326
22%

RepTar


Predicted targets of host and viral miRNAs
327
28%

OPTIC


Orthologous and Paralogous Transcripts in Clades
328
22%

JuncDB


Exon-exon Junction database
329
22%

GELBANK


GELBANK is a publicly available database of two-dimensional gel electrophoresis (2DE) gel images of proteomes from organisms with known genome information (available at http://gelbank.anl.gov). GELBANK serves as a database for those proteomics labs t ...
330
27%

The Arabidopsis Information Resource


The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana. Data available from TAIR includes the complete genome sequence along with gene structure, gene pro ...
331
50%

RaftProt


Lipid raft associated proteins in mammals
332
22%

Nematodes.org


Wiki for coordinating nematode sequencing projects
333
28%

Ensembl Fungi


Ensembl Fungi is a browser for fungal genomes. A majority of these are taken from the databases of the International Nucleotide Sequence Database Collaboration (the European Nucleotide Archive at the EBI, GenBank at the NCBI, and the DNA Database of ...
334
40%

miRGator


microRNA target prediction, functional analysis, and gene expression data
335
22%

BIOZON


Biozon is a platform that allows for the storage, management, and analysis of interrelated proteins, genes, interactions, protein families, cellular pathways and more. These heterogeneous data types and the relations between them are locally warehous ...
336
22%

OnTheFly


DNA-binding specificities of transcription factors in Drosophila
337
22%

EnteroBase


Global genomic population structure of Clostridioides difficile
338
22%

Cyanolyase


Sequences and motifs of the phycobilin lyase protein family
339
23%

TransportDB


Sequences and classification of predicted membrane transporters encoded in complete genomes
340
22%

Secreted Protein Database


Secreted proteins from human, mouse and rat
341
22%

tRFdb


Short (14-32 nt) tRNA-related fragments
342
22%

CharProtDB


Experimentally Characterized Protein annotations
343
28%

SuperCAT


A database for multilocus sequence typing analysis of the Bacillus cereus group of bacteria
344
22%

Animal Toxin Database


Database of animal toxins
345
22%

*ReputationScore indicates how established a given datasource is.


