Tag: polypeptide region

Found 66 sources
Source Match ReputationScore*

UniProt Knowledgebase

Universal Protein resource. A database of protein sequence and functional information, many entries being derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the re ...


The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Pfam also generates higher-level groupings of related entries, known as clans. A clan is a collection of Pf ...

CLUSTAL-W Alignment Format

CLUSTAL-W Alignment Format is a simple text-based format, often with a *.aln file extension, used for the input and output of DNA or protein sequences into the Clustal suite of multiple alignment programs.

Integrated resource of protein families, domains and functional sites

InterPro is a resource that provides functional analysis of protein sequences by classifying them into families and predicting the presence of domains and important sites. To classify proteins in this way, InterPro uses predictive models, known as si ...

Protein ANalysis THrough Evolutionary Relationships: Classification of Genes and Proteins

The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System is a unique resource that classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function ...

FASTA Sequence Format

FASTA format is a text-based format for representing either nucleotide sequences or peptide sequences, in which nucleotides or amino acids are represented using single-letter codes. The format also allows for sequence names and comments to precede th ...
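As a rough illustration of the layout described above, the following minimal sketch reads FASTA text into (header, sequence) pairs. It assumes the common convention that a header line starts with ">" and that the sequence may wrap across several lines. The function name `parse_fasta` and the example records are hypothetical and not taken from any listed resource.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text.

    Assumes header lines begin with '>' and that every following
    non-empty line, up to the next header, belongs to that sequence.
    """
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical example data: one wrapped peptide and one nucleotide record.
example = """>seq1 hypothetical peptide
MKTAYIAKQR
QISFVKSHFS
>seq2 hypothetical nucleotide
ACGTACGT
"""

records = list(parse_fasta(example))
for name, seq in records:
    print(name, len(seq))
```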

Structural Classification Of Proteins

The SCOP database is curated both manually and with the use of automated tools. This freely available resource aims to provide a comprehensive description of the structural and evolutionary relationships between all proteins whose structure is know ...


The MEROPS database is an information resource for peptidases (also termed proteases, proteinases and proteolytic enzymes) and the proteins that inhibit them.

European Nucleotide Archive

The European Nucleotide Archive (ENA) is a globally comprehensive data resource for nucleotide sequence, spanning raw data, alignments and assemblies, functional and taxonomic annotation and rich contextual data relating to sequenced samples and expe ...

Gramene: A curated, open-source, integrated data resource for comparative functional genomics in plants

Gramene's purpose is to provide added value to plant genomics data sets available within the public sector, which will facilitate researchers' ability to understand the plant genomes and take advantage of genomic sequence known in one species for ide ...

Comprehensive Antibiotic Resistance Database

A bioinformatic database of antimicrobial resistance genes, their products and associated phenotypes.


ConoServer is a database specializing in sequences and structures of peptides expressed by marine cone snails. The database gives access to protein sequences, nucleic acid sequences and structural information on conopeptides. ConoServer's data are fi ...

Clusters of Orthologous Groups (COG) Analysis Ontology

The CAO ontology is designed to support COG enrichment studies that use Fisher's exact test. It is used by ontology-based applications for statistical analysis of the COG database.

FASTQ Sequence and Sequence Quality Format

FASTQ is a text-based file format for sharing sequencing data combining both the sequence and an associated per base quality score.
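To make the pairing of sequence and per-base quality concrete, here is a minimal sketch that reads four-line FASTQ records and decodes the quality string. It assumes the widely used four-line layout ("@" header, sequence, "+" separator, quality) and a Phred+33 ASCII offset; other encodings exist, so the offset is a parameter. The name `parse_fastq` and the example read are hypothetical.

```python
def parse_fastq(text, offset=33):
    """Yield (name, sequence, quality_scores) from four-line FASTQ records.

    Assumes one record per four non-empty lines and an ASCII quality
    encoding with the given offset (Phred+33 by default).
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected a multiple of four non-empty lines")
    for i in range(0, len(lines), 4):
        header, seq, sep, qual = lines[i:i + 4]
        if not header.startswith("@") or not sep.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        # One integer score per base, recovered from the ASCII code.
        yield header[1:], seq, [ord(c) - offset for c in qual]


# Hypothetical example read with four bases.
example = "@read1\nACGT\n+\nII5!\n"
reads = list(parse_fastq(example))
print(reads)
```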

The Protein Database

The Entrez Protein search and retrieval system contains protein entries that have been compiled from a variety of sources, including SwissProt, PIR, PRF, PDB, and translations from annotated coding regions in GenBank and RefSeq.

TTD, Therapeutic Target Database

The Therapeutic Target Database provides information about therapeutic protein and nucleic acid targets, the targeted disease, pathway information and the corresponding drugs directed at each of these targets. Also included in this database are links ...


PLAZA is a platform for comparative, evolutionary, and functional genomics. The platform consists of multiple instances, where each instance contains additional genomes, improved genome annotations, new software tools, etc.

Database of Bacterial Exotoxins for Human

DBETH is the Database of Bacterial Exotoxins for Human. The aim of this database is to assemble information on the toxins responsible for causing bacterial pathogenesis in humans.

European Hepatitis C Virus database

The euHCVdb is mainly oriented towards protein sequence, structure and function analyses and structural biology of Hepatitis C Virus.

Group II introns database

Database for identification and cataloguing of group II introns. All bacterial introns listed are full-length and appear to be functional, based on intron RNA and IEP characteristics. The database names the full-length introns, and provides informati ...

DNA Data Bank of Japan

An annotated collection of all publicly available nucleotide and protein sequences. DDBJ collects sequence data mainly from Japanese researchers, as well as researchers in other countries. DDBJ is part of the International Nucleotide Sequence Databas ...


Gene3D uses the information in CATH to predict the locations of structural domains on millions of protein sequences available in public databases. Sequence data from UniProtKB and Ensembl for domains with no experimentally determined structures are s ...

Complex Portal

The Complex Portal is a manually curated, encyclopaedic resource of macromolecular complexes from a number of key model organisms. The majority of complexes are made up of proteins but may also include nucleic acids or small molecules. All data is fr ...

Termini-Oriented Protein Function INferred Database

The Termini-Oriented Protein Function INferred Database (TopFIND) is an integrated knowledgebase focused on protein termini, their formation by proteases and functional implications. It contains information about the processing and the processing sta ...

COVID-19 Data Portal

The COVID-19 Data Portal enables researchers to upload, access and analyse COVID-19 related reference data and specialist datasets. The aim of the COVID-19 Data Portal is to facilitate data sharing and analysis, and to accelerate coronavirus research ...


ProtClustDB is a collection of related protein sequences (clusters) consisting of Reference Sequence proteins encoded by complete genomes. This database contains both curated and non-curated clusters.

HAMAP database of microbial protein families

HAMAP is a system, based on manual protein annotation, that identifies and semi-automatically annotates proteins that are part of well-conserved families or subfamilies: the HAMAP families. HAMAP is based on manually created family rules and is appli ...


SNPeffect is a database for phenotyping human single nucleotide polymorphisms (SNPs). SNPeffect primarily focuses on the molecular characterization and annotation of disease and polymorphism variants in the human proteome. Further, SNPeffect holds pe ...

HMMER Profile File Format

The profile hidden Markov Model (HMM) calculated from multiple sequence alignment data in this service is stored in Profile HMM save format (usually with ".hmm" extension). It is an ASCII file containing a lot of header and descriptive records follow ...

Bacterial protein tYrosine Kinase database

The Bacterial protein tYrosine Kinase database (BYKdb) contains computer-annotated BY-kinase sequences. The database web interface allows static and dynamic queries and provides integrated analysis tools including sequence annotation.


BAliBASE is a benchmark alignment database, including enhancements for repeats, transmembrane sequences and circular permutations.

Human Histone Database

HIstome (Human histone database) is a freely available, specialist, electronic database dedicated to displaying information about human histone variants, the sites of their post-translational modifications and various histone-modifying enzymes.


PathBank is an interactive, visual database containing more than 100 000 machine-readable pathways found in model organisms such as humans, mice, E. coli, yeast, and Arabidopsis thaliana. The majority of these pathways are not found in any other path ...

Pocketome: an encyclopedia of small-molecule binding sites in 4D

The Pocketome is an encyclopedia of conformational ensembles of druggable binding sites that can be identified experimentally from co-crystal structures in the Protein Data Bank. Each Pocketome entry describes a site on a protein surface that is invo ...


Genome3D is a resource that provides structural annotation and 3D models of genomes of model organisms such as human, yeast and E. coli. The database can be used to predict protein structures that have not yet been identified. Genome3D uses structural ...

Minimum Information About Sample Preparation for a Phosphoproteomics Experiment

Please note: We cannot find an up-to-date website or official reporting guideline document for this resource. As such, we have marked it as Uncertain. Please contact us if you have any information on the current status of this resource.


WALTZ-DB 2.0 is a database for characterizing short peptides for their amyloid fiber-forming capacities. The majority of the data comes from electron microscopy, FTIR and Thioflavin-T experiments done by the Switch lab. Apart from that class of data ...

Proteome-pI : proteome isoelectric point database

Proteome-pI is an online database containing information about predicted isoelectric points for 5,029 proteomes (21 million sequences) calculated using 18 methods. The isoelectric point, the pH at which a particular molecule carries no net electri ...


BacMap is a picture atlas of annotated bacterial genomes. It is an interactive visual database containing hundreds of fully labeled, zoomable, and searchable maps of bacterial genomes.

The Yeast Metabolome DataBase

The Yeast Metabolome Database (YMDB) is a manually curated database of small molecule metabolites found in or produced by Saccharomyces cerevisiae (also known as Baker’s yeast and Brewer’s yeast). This database covers metabolites described in textboo ...


Families of nuclear hormone receptors


PDB-REPRDB is a reorganized database of protein chains from PDB (Protein Data Bank), and provides 'the list of the representative protein chains' and 'the list of similar protein chain groups'.

Evolutionary Annotation Database

Evola contains ortholog information of all human genes among vertebrates. Orthologs are a pair of genes in different species that evolved from a common ancestral gene by speciation. In Evola, orthologs were detected by comparative genomics and amino ...


KinMutBase is a comprehensive database of disease-causing mutations in protein kinase domains. This resource provides plenty of information, namely mutation statistics and display, clickable sequences with mutations and changes to restriction enzyme ...


MitoProteome is a mitochondrial protein sequence database and annotation system. The initial release contains 847 human mitochondrial protein sequences, derived from public sequence databases and mass spectrometric analysis of highly purified human h ...


The Proteomics Informatics Working Group is developing standards for describing the results of identification and quantification processes for proteins, peptides, small molecules and protein modifications from mass spectrometry. This working group is ...

Indel Flanking Region Database

Indel Flanking Region Database is an online resource for indels (insertion/deletions) and the flanking regions of proteins in SCOP superfamilies. It aims at providing a comprehensive dataset for analyzing the qualities of amino acid indels, substitut ...


An InterMine interface to data from Phytozome


Detection of functional divergence in human protein families. Cube-DB is a database of pre-evaluated conservation and specialization scores for residues in paralogous proteins belonging to multi-member families of human proteins. Protein family class ...

Sequence-Structural Templates of Single-member Superfamilies

SSToSS is a database which provides sequence-structural templates of single member protein domain superfamilies like PASS2. Sequence-structural templates are recognized by considering the content and overlap of sequence similarity and structural para ...


HypoxiaDB is a manually-curated non-redundant catalogue of human hypoxia-regulated proteins with a goal of collecting proteins whose expression patterns are altered in hypoxic conditions.

Evolutionary Trace

Relative evolutionary importance of amino acids within a protein sequence.

ABCD database

The ABCD (AntiBodies Chemically Defined) database is a manually curated depository of sequenced antibodies.


A database for systems biology of DNA dynamics during the cell life.


Developed by the Consortium for Top-Down Proteomics, ProForma is a standardized notation for writing fully characterized proteoforms. A proteoform is a specific set of amino acids arranged in a particular order, which may be further modified (cotrans ...

Silkworm Pathogen Database

Silkworm Pathogen Database (SilkPathDB) is a comprehensive resource for studying pathogens of the silkworm, including microsporidia, fungi, bacteria and viruses. SilkPathDB provides access to not only genomic data including functional annotation of gene ...

BCL-2 Database

BCL2DB is a database designed to integrate data on BCL-2 family members and BH3-only proteins.

IPD-NHKIR - Non-Human Killer-cell Immunoglobulin-like Receptors

The IPD-NHKIR database provides a centralised repository for non-human KIR (NHKIR) sequences. Killer-cell Immunoglobulin-like Receptors (KIR) have been shown to be highly polymorphic at the allelic and haplotypic level. KIRs are members of the immuno ...

Conformation Angles Database

Conformation Angles DataBase [ CADB-3.0 ] is a comprehensive, authoritative and timely knowledge base developed to facilitate retrieval of information related to the conformational angles (main-chain and side-chain) of the amino acid residues present ...

Cnidarian Evolutionary Genomics Database

CnidBase, the Cnidarian Evolutionary Genomics Database, is a tool for investigating the evolutionary, developmental and ecological factors that affect gene expression and gene function in cnidarians.

Vertebrate Secretome Database

Vertebrate Secretome Database (VerSeDa) stores information about proteins that are predicted to be secreted through the classical and non-classical mechanisms, for the wide range of vertebrate species deposited at the NCBI, UCSC and ENSEMBL sites.


A world bioinformatic public service for high-speed access to up-to-date DNA & protein biological sequence databanks.

Amino Acid Ontology

Amino Acid Ontology is an ontology of amino acids and their properties. It captures how biochemists talk about amino acids; that is, it is a conceptualisation of amino acids. This ontology smoothens out the way in which amino acids are described as t ...

BpForms Grammar

The BpForms Grammar extends the IUPAC/IUBMB notation commonly used to represent unmodified DNA, RNA, and proteins to describe non-canonical forms of DNA, RNA, and proteins. Features include the representation of a wider range of monomeric forms, incl ...

BcForms Grammar

BcForms represents complexes as a set of subunits, including their stoichiometries, and a set of interchain/intersubunit crosslinks. Furthermore, BcForms can be combined with BpForms and SMILES descriptions of subunits to calculate properties of ...

IUPAC-IUB Joint Commission on Biochemical Nomenclature - Nomenclature and Symbolism for Amino Acids and Peptides

The Nomenclature and Symbolism for Amino Acids and Peptides, created by the IUPAC-IUB Joint Commission on Biochemical Nomenclature, formalizes the naming scheme for amino acids, non-peptide derivatives of amino acids and peptides as well as peptide d ...

*ReputationScore indicates how established a given datasource is.
