Tag: sequence annotation

Found 31 sources
Source Match ReputationScore*

Gene Ontology

The Gene Ontology resource provides a computational representation of our current scientific knowledge about the functions of genes (or, more properly, the protein and non-coding RNA molecules produced by genes) from many different organisms, from hu ...

UniProt Knowledgebase

Universal Protein resource. A database of protein sequence and functional information, many entries being derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the re ...

UCSC Genome Browser database

Genome assemblies and aligned annotations for a wide range of vertebrates and model organisms, along with an integrated tool set for visualizing, comparing, analyzing and sharing both publicly available and user-generated genomic datasets.

The UCSC Archaeal Genome Browser

The UCSC Archaeal Genome Browser is a window on the biology of more than 100 microbial species from the domain Archaea. Basic gene annotation is derived from NCBI Genbank/RefSeq entries, with overlays of sequence conservation across multiple species, ...

Gramene: A curated, open-source, integrated data resource for comparative functional genomics in plants

Gramene's purpose is to provide added value to plant genomics data sets available within the public sector, which will facilitate researchers' ability to understand the plant genomes and take advantage of genomic sequence known in one species for ide ...

Sequence Ontology

SO is a collaborative ontology project for the definition of sequence features used in biological sequence annotation. The Sequence Ontology is a set of terms and relationships used to describe the features and attributes of biological sequence. SO i ...

European Nucleotide Archive

The European Nucleotide Archive (ENA) is a globally comprehensive data resource for nucleotide sequence, spanning raw data, alignments and assemblies, functional and taxonomic annotation and rich contextual data relating to sequenced samples and expe ...

MGnify

EBI Metagenomics has changed its name to MGnify to reflect a change in scope. This is a free-to-use resource aiming at supporting all metagenomics researchers. The service is an automated pipeline for the analysis and archiving of metagenomic data th ...

Synthetic Biology Open Language

The Synthetic Biology Open Language (SBOL) is a standard used for the in silico representation of genetic designs. SBOL is designed to allow synthetic biologists and genetic engineers to electronically exchange designs, send and receive genetic desig ...

CAPS-DB : a structural classification of helix-capping motifs

CAPS-DB is a structural classification of helix-cappings or caps compiled from protein structures. Caps extracted from protein structures have been structurally classified based on geometry and conformation and organized in a tree-like hierarchical c ...

Universal PBM Resource for Oligonucleotide Binding Evaluation

The UniPROBE (Universal PBM Resource for Oligonucleotide Binding Evaluation) database hosts data generated by universal protein binding microarray (PBM) technology on the in vitro DNA binding specificities of proteins.

DNA Data Bank of Japan

An annotated collection of all publicly available nucleotide and protein sequences. DDBJ collects sequence data mainly from Japanese researchers, as well as researchers in other countries. DDBJ is part of the International Nucleotide Sequence Dat ...

neXtProt

neXtProt is a comprehensive human-centric discovery platform, offering its users a seamless integration of and navigation through protein-related data.

Genome Properties

Genome properties is an annotation system whereby functional attributes can be assigned to a genome, based on the presence of a defined set of protein signatures within that genome. This is a reimplementation at EMBL-EBI of a resource previously host ...

Termini-Oriented Protein Function INferred Database

The Termini-Oriented Protein Function INferred Database (TopFIND) is an integrated knowledgebase focused on protein termini, their formation by proteases and functional implications. It contains information about the processing and the processing sta ...

HAMAP database of microbial protein families

HAMAP is a system, based on manual protein annotation, that identifies and semi-automatically annotates proteins that are part of well-conserved families or subfamilies: the HAMAP families. HAMAP is based on manually created family rules and is appli ...

AmoebaDB

AmoebaDB belongs to the EuPathDB family of databases and is an integrated genomic and functional genomic database for Entamoeba and Acanthamoeba parasites. In its first release, AmoebaDB contained the genomes of three Entamoeba species. AmoebaDB int ...

RNAcentral

RNAcentral is a free, public resource that offers integrated access to a comprehensive and up-to-date set of non-coding RNA sequences provided by a collaborating group of databases representing a broad range of organisms and RNA types.

Gene Wiki

The goal of the Gene Wiki is to apply community intelligence to the annotation of gene and protein function. The Gene Wiki is an informal collection of pages on human genes and proteins, and this effort to develop these pages is tightly coordinated w ...

MirGeneDB

MirGeneDB is a database of microRNA genes that have been validated and annotated as described in "A Uniform System for the Annotation of Vertebrate microRNA Genes and the Evolution of the Human microRNAome". The initial version contained 1,434 micro ...

Target Central Resource Database

TCRD is the central resource behind the Illuminating the Druggable Genome Knowledge Management Center (IDG-KMC). TCRD contains information about human targets, with special emphasis on four families of targets that are central to the NIH IDG initiati ...

Peroxibase

Peroxibase provides access to peroxidase sequences from all kingdoms of life, and provides a series of bioinformatics tools and facilities suitable for analysing these sequences.

Database of small human non-coding RNAs

Integrated annotation and sequencing-based expression data for all major classes of human small non-coding RNAs (sncRNAs) for both full sncRNA transcripts and mature sncRNA products derived from these larger RNAs.


Families of nuclear hormone receptors

Description of Plant Viruses

DPVweb provides a central source of information about viruses, viroids and satellites of plants, fungi and protozoa. Comprehensive taxonomic information, including brief descriptions of each family and genus, and classified lists of virus sequences a ...

Feature Annotation Location Description Ontology

The Feature Annotation Location Description Ontology (FALDO) describes the positions of annotated features on linear and circular sequences for data resources represented in RDF and/or OWL. FALDO can be used to describe nucleotide features in sequ ...

Medaka Expression Pattern Database

MEPD contains expression data of genes and regulatory elements assayed in the Japanese killifish Medaka (Oryzias latipes).

Domain-centric GO

Domain-centric GO provides associations between ontological terms and protein domains at the superfamily and family levels. Some functional units consist of more than one domain acting together or acting at an interface between domains; therefore, on ...

Berkeley Drosophila Genome Project EST database

The goals of the Drosophila Genome Center are to finish the sequence of the euchromatic genome of Drosophila melanogaster to high quality and to generate and maintain biological annotations of this sequence.

DDBJ Sequence Read Archive

DDBJ Sequence Read Archive (DRA) is an archive database for output data generated by next-generation sequencing machines including Roche 454 GS System®, Illumina Genome Analyzer®, Applied Biosystems SOLiD® System, and others. DRA is a member of the I ...

DDBJ Trace Archive

DDBJ Trace Archive (DTA) is a permanent repository of DNA sequence chromatograms (traces), base calls, and quality estimates for single-pass reads from various large-scale sequencing projects. DTA is a member of the International Nucleotide Sequence ...

*ReputationScore indicates how established a given data source is.
