Tag: protein domains and classification

Found 40 sources

Source	Match	ReputationScore*
Pfam The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Pfam also generates higher-level groupings of related entries, known as clans. A clan is a collection of Pf ...		100%
Conserved Domain Database The Conserved Domain Database (CDD) brings together several collections of multiple sequence alignments representing conserved domains, including NCBI-curated domains, which use 3D-structure information to explicitly define domain boundaries and p ...		84%
Simple Modular Architecture Research Tool SMART (Simple Modular Architecture Research Tool) is a web resource providing simple identification and extensive annotation of protein domains and the exploration of protein domain architectures. It allows the identification and annotation of geneti ...		80%
Protein ANalysis THrough Evolutionary Relationships: Classification of Genes and Proteins The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System is a unique resource that classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function ...		76%
Evolutionary Genealogy of Genes: Non-supervised Orthologous Groups eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups) is a database of orthologous groups of genes. The orthologous groups are annotated with functional description lines (derived by identifying a common denominator for the gene ...		65%
InParanoid The InParanoid database provides a user interface to orthologs inferred by the InParanoid algorithm. InParanoid release 8 is based on the 66 reference proteomes that the 'Quest for Orthologs' community has agreed on using, plus 207 additional proteom ...		58%
ProDom ProDom is a comprehensive set of protein domain families automatically generated from the UniProt Knowledgebase.		56%
Orthologous MAtrix The OMA (“Orthologous MAtrix”) project is a method and database for the inference of orthologs among complete genomes. The distinctive features of OMA are its broad scope and size, high quality of inferences, feature-rich web interface, availability ...		54%
MobiDB MobiDB is a database of intrinsically disordered regions (IDRs) and related features from various sources and prediction tools. Different levels of reliability and different features are reported as different and independent annotations. The database ...		53%
TIGRFAMs TIGRFAMs is a collection of manually curated protein families focusing primarily on prokaryotic sequences. It consists of hidden Markov models (HMMs), multiple sequence alignments, Gene Ontology (GO) terminology, Enzyme Commission (EC) numbers, gene s ...		52%
Database of Orthologous Groups OrthoDB presents a catalog of eukaryotic orthologous protein-coding genes. Orthology refers to the last common ancestor of the species under consideration, and thus OrthoDB explicitly delineates orthologs at each radiation along the species phylogeny ...		51%
OrtholugeDB OrtholugeDB contains Ortholuge-based orthology predictions for completely sequenced bacterial and archaeal genomes. It is also a resource for reciprocal best BLAST-based ortholog predictions, in-paralog predictions (recently duplicated genes) and ort ...		49%
BAliBASE BAliBASE is a benchmark alignment database that includes enhancements for repeats, transmembrane sequences and circular permutations.		46%
ProtoNet ProtoNet provides a hierarchical clustering of UniProt protein sequences into trees. It allows the study of the subfamilies and superfamilies of a protein, using UniRef50 clusters.		46%
RBPDB RNA-binding proteins and their specificities		45%
short Open Reading Frame database sORFs.org is a database for sORFs identified using ribosome profiling. Starting from ribosome profiling, sORFs.org identifies sORFs, incorporates state-of-the-art tools and metrics and stores results in a public database. Two query interfaces are pro ...		45%
PIR SuperFamily The PIR SuperFamily concept is being used as a guiding principle to provide comprehensive and non-overlapping clustering of UniProtKB sequences into a hierarchical order to reflect their evolutionary relationships.		44%
RNA Binding Protein Variant Database RBP-Var is a database of functional variants involved in regulation mediated by RNA-binding proteins. Human genome variants can change the RNA structure and affect RNA-protein interactions.		37%
FunShift Functional divergence between the subfamilies of a protein domain family		37%
SIMAP Protein sequences are of utmost importance for studying the function and evolution of genes and genomes. Therefore a rich collection of methods in computational biology relies on the analysis and comparison of protein sequences. Many of these intensi ...		36%
Protein Classification Benchmark Collection The Protein Classification Benchmark Collection was created in order to create standard datasets on which the performance of machine learning methods can be compared.		30%
PANDIT PANDIT is a collection of multiple sequence alignments and phylogenetic trees covering many common protein domains. It contains the seed protein sequence alignments from the Pfam-A (curated families) database; nucleotide sequence alignments derived f ...		30%
SISYPHUS The SISYPHUS database contains manually curated multiple structural alignments constructed for a set of proteins with known three-dimensional structures that have revealed non-trivial structural relationships and whose structural similarity is ambigu ...		30%
ADDA - A Domain Database ADDA is a global clustering of protein sequences into protein domains and protein domain families. The database currently contains domains for 1.5 million sequences from UniProt, ENSEMBL, and other sequence databases. The domains are grouped into 123,000 ...		28%
LenVarDB Database of length variation in protein domains		28%
InterDom Putative protein domain interactions		28%
Protein Clusters Related protein sequences (clusters) of Reference Sequence proteins encoded by complete genomes		28%
PhyloFacts The PhyloFacts resource contains pre-calculated structural and phylogenomic analysis of over 15,000 protein family "books" across the Tree of Life. Each book includes a multiple sequence alignment, one or more phylogenetic trees, predicted subfamilie ...		28%
iPfam A database of Pfam domain interactions		28%
MulPSSM Representation of multiple sequence alignments of protein families in terms of Position Specific Scoring Matrices (PSSMs) is commonly used in the detection of remote homologues. A PSSM is generated with respect to one of the sequences involved in the ...		28%
OPTIC Orthologous and Paralogous Transcripts in Clades		28%
iProClass The iProClass database provides value-added information reports for UniProtKB and unique NCBI Entrez protein sequences in UniParc, with links to over 175 biological databases, including databases for protein families, functions and pathways, interact ...		28%
Hits High throughput genome (HTG) and expressed sequence tag (EST) sequences are currently the most abundant nucleotide sequence classes in the public database. The large volume, high degree of fragmentation and lack of gene structure annotations prevent ...		28%
PALI The database of Phylogeny and ALIgnment of homologous protein structures (PALI) contains structure-based sequence alignments and dendrograms based on information primarily derived from the structural alignments at domain level [1,2]. Protein domain d ...		28%
BIOZON Biozon is a platform that allows for the storage, management, and analysis of interrelated proteins, genes, interactions, protein families, cellular pathways and more. These heterogeneous data types and the relations between them are locally warehous ...		28%
DomIns - Database of Domain Insertions Proteins can be formed by single or multiple domains. The process of recombination at the molecular level has generated a wide variety of multi-domain proteins with specific domain organization to cater to the functional requirements of an organism. ...		28%
3DSwap: Database of Proteins involved in 3D domain Swapping Protein oligomerization is a key biochemical step to perform the designated function of proteins. 3D domain swapping is a unique protein oligomerization phenomenon observed in a wide array of proteins involved in diverse functional roles. Apart from ...		28%
EVEREST - EVolutionary Ensembles of REcurrent SegmenTs EVEREST is an automatic computational process identifying protein domains and classifying them into families. The EVEREST database contains 20,029 families, each defined by one or more HMMER HMMs. EVEREST has been thoroughly tested and evaluated, and ha ...		28%
SBASE SBASE (http://www.icgeb.trieste.it/sbase) is an on-line collection of protein domain sequences and related computational tools designed to facilitate detection of domain homologies based on simple database search. The tenth - "jubilee release" of the ...		28%
SUPFAM During the course of evolution, protein sequences derived from a common ancestor diverge by mutations, insertions and deletions, gene duplication and recombination and give rise to diverse families with no easily detectable sequence similarity. These ...		28%

*ReputationScore indicates how established a given datasource is.
