Tag: protein domains and classification


Found 40 sources
Source Match ReputationScore*

Pfam


The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Pfam also generates higher-level groupings of related entries, known as clans. A clan is a collection of Pf ...
100%

Conserved Domain Database


The Conserved Domain Database (CDD) brings together several collections of multiple sequence alignments representing conserved domains, including NCBI-curated domains, which use 3D-structure information to explicitly define domain boundaries and p ...
81%

Protein ANalysis THrough Evolutionary Relationships: Classification of Genes and Proteins


The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System is a unique resource that classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function ...
77%

Simple Modular Architecture Research Tool


SMART (Simple Modular Architecture Research Tool) is a web resource providing simple identification and extensive annotation of protein domains and the exploration of protein domain architectures. It allows the identification and annotation of geneti ...
77%

Evolutionary Genealogy of Genes: Non-supervised Orthologous Groups


eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups) is a database of orthologous groups of genes. The orthologous groups are annotated with functional description lines (derived by identifying a common denominator for the gene ...
63%

InParanoid


The InParanoid database provides a user interface to orthologs inferred by the InParanoid algorithm. InParanoid release 8 is based on the 66 reference proteomes that the 'Quest for Orthologs' community has agreed on using, plus 207 additional proteom ...
59%

ProDom


ProDom is a comprehensive set of protein domain families automatically generated from the UniProt Knowledge Database.
58%

Orthologous MAtrix


The OMA (“Orthologous MAtrix”) project is a method and database for the inference of orthologs among complete genomes. The distinctive features of OMA are its broad scope and size, high quality of inferences, feature-rich web interface, availability ...
56%

Database of Orthologous Groups


OrthoDB presents a catalog of eukaryotic orthologous protein-coding genes. Orthology refers to the last common ancestor of the species under consideration, and thus OrthoDB explicitly delineates orthologs at each radiation along the species phylogeny ...
53%

MobiDB


A database of protein disorder and mobility annotations. MobiDB was designed to offer a centralized resource for annotations of intrinsic protein disorder. The database features three levels of annotation: manually curated, indirect and predicted. Ma ...
53%

TIGRFAMs


TIGRFAMs collates multiple sequence alignments, Hidden Markov Models for protein sequence classification, and information that assists the automated annotation of (mostly prokaryotic) proteins. TIGRFAMs was last updated in 2014.
52%

OrtholugeDB


OrtholugeDB contains Ortholuge-based orthology predictions for completely sequenced bacterial and archaeal genomes. It is also a resource for reciprocal best BLAST-based ortholog predictions, in-paralog predictions (recently duplicated genes) and ort ...
51%

ProtoNet


ProtoNet is a hierarchical clustering of UniProt protein sequences into trees. It allows the study of the sub-families and super-families of a protein, using UniRef50 clusters.
48%

BAliBASE


BAliBASE is a benchmark alignment database, including enhancements for repeats, transmembrane sequences and circular permutations.
48%

RBPDB


RNA-binding proteins and their specificities
47%

short Open Reading Frame database


sORFs.org is a database for sORFs identified using ribosome profiling. Starting from ribosome profiling, sORFs.org identifies sORFs, incorporates state-of-the-art tools and metrics and stores results in a public database. Two query interfaces are pro ...
46%

PIR SuperFamily


The PIR SuperFamily concept is being used as a guiding principle to provide comprehensive and non-overlapping clustering of UniProtKB sequences into a hierarchical order to reflect their evolutionary relationships.
46%

RNA Binding Protein Variant Database


RBP-Var is a database of functional variants involved in regulation mediated by RNA-binding proteins. Human genome variants can change the RNA structure and affect RNA-protein interactions.
38%

FunShift


Functional divergence between the subfamilies of a protein domain family
38%

SIMAP


Protein sequences are of utmost importance for studying the function and evolution of genes and genomes. Therefore a rich collection of methods in computational biology relies on the analysis and comparison of protein sequences. Many of these intensi ...
37%

Protein Classification Benchmark Collection


The Protein Classification Benchmark Collection was created to provide standard datasets on which the performance of machine learning methods can be compared.
32%

PANDIT


PANDIT is a collection of multiple sequence alignments and phylogenetic trees covering many common protein domains. It contains the seed protein sequence alignments from the Pfam-A (curated families) database; nucleotide sequence alignments derived f ...
32%

SISYPHUS


The SISYPHUS database contains manually curated multiple structural alignments constructed for a set of proteins with known three-dimensional structures that have revealed non-trivial structural relationships and whose structural similarity is ambigu ...
32%

ADDA - A Domain Database


ADDA is a global clustering of protein sequences into protein domains and protein domain families. The database currently contains domains for 1.5 million sequences from UniProt, ENSEMBL, and other sequence databases. The domains are grouped into 123,000 ...
30%

LenVarDB


Database of length variation in protein domains
30%

InterDom


Putative protein domain interactions
30%

Protein Clusters


Related protein sequences (clusters) of Reference Sequence proteins encoded by complete genomes
30%

PhyloFacts


The PhyloFacts resource contains pre-calculated structural and phylogenomic analysis of over 15,000 protein family "books" across the Tree of Life. Each book includes a multiple sequence alignment, one or more phylogenetic trees, predicted subfamilie ...
30%

iPfam


A database of Pfam domain interactions
30%

MulPSSM


Representation of multiple sequence alignments of protein families in terms of Position Specific Scoring Matrices (PSSMs) is commonly used in the detection of remote homologues. A PSSM is generated with respect to one of the sequences involved in the ...
30%

OPTIC


Orthologous and Paralogous Transcripts in Clades
30%

iProClass


The iProClass database provides value-added information reports for UniProtKB and unique NCBI Entrez protein sequences in UniParc, with links to over 175 biological databases, including databases for protein families, functions and pathways, interact ...
30%

Hits


High throughput genome (HTG) and expressed sequence tag (EST) sequences are currently the most abundant nucleotide sequence classes in the public database. The large volume, high degree of fragmentation and lack of gene structure annotations prevent ...
30%

PALI


The database of Phylogeny and ALIgnment of homologous protein structures (PALI) contains structure-based sequence alignments and dendrograms based on information primarily derived from the structural alignments at domain level [1,2]. Protein domain d ...
30%

BIOZON


Biozon is a platform that allows for the storage, management, and analysis of interrelated proteins, genes, interactions, protein families, cellular pathways and more. These heterogeneous data types and the relations between them are locally warehous ...
30%

DomIns - Database of Domain Insertions


Proteins can be formed by single or multiple domains. The process of recombination at the molecular level has generated a wide variety of multi-domain proteins with specific domain organization to cater to the functional requirements of an organism. ...
30%

EVEREST - EVolutionary Ensembles of REcurrent SegmenTs


EVEREST is an automatic computational process identifying protein domains and classifying them into families. The EVEREST database contains 20,029 families, each defined by one or more HMMER HMMs. EVEREST has been thoroughly tested and evaluated, and ha ...
30%

3DSwap: Database of Proteins involved in 3D domain Swapping


Protein oligomerization is a key biochemical step to perform the designated function of proteins. 3D domain swapping is a unique protein oligomerization phenomenon observed in a wide array of proteins involved in diverse functional roles. Apart from ...
30%

SBASE


SBASE (http://www.icgeb.trieste.it/sbase) is an on-line collection of protein domain sequences and related computational tools designed to facilitate detection of domain homologies based on simple database search. The tenth - "jubilee release" of the ...
30%

SUPFAM


During the course of evolution, protein sequences derived from a common ancestor diverge by mutations, insertions and deletions, gene duplication and recombination and give rise to diverse families with no easily detectable sequence similarity. These ...
30%

*ReputationScore indicates how established a given datasource is.



