UniProt Knowledgebase

Universal Protein resource. A database of protein sequence and functional information, many entries being derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the re ...

Human Protein Atlas

The Human Protein Atlas is a program started with the aim of mapping all the human proteins in cells, tissues and organs by integrating various omics technologies. It consists of three parts: Tissue Atlas showing the distribution of proteins acros ...

Chemical Entities of Biological Interest

Chemical Entities of Biological Interest (ChEBI) is a free dictionary that describes 'small' chemical compounds. These compounds include distinct synthetic or natural atoms, molecules, ions, ion pairs, radicals, radical ions, complexes, conformers, et ...

SWISS-MODEL Repository of 3D protein structure models

The SWISS-MODEL Repository is a database of annotated 3D protein structure models generated by the SWISS-MODEL homology-modelling pipeline for protein sequences of selected model organisms.

The Cambridge Structural Database

Established in 1965, the Cambridge Structural Database (CSD) is a repository for small-molecule organic and metal-organic crystal 3D structures. Database records are automatically checked and manually curated by one of our expert in-house scienti ...


The MEROPS database is an information resource for peptidases (also termed proteases, proteinases and proteolytic enzymes) and the proteins that inhibit them.


VectorBase is a web-accessible data repository for information about invertebrate vectors of human pathogens. VectorBase annotates and maintains vector genomes (as well as a number of non-vector genomes for comparative analysis) providing an integrat ...


PROSITE is a database of protein families and domains. PROSITE consists of documentation entries describing protein domains, families and functional sites as well as associated patterns and profiles to identify them.


The Entrez Global Query Cross-Database Search System is a federated search engine, or web portal, that allows users to search many discrete health sciences databases at the National Center for Biotechnology Information (NCBI) website. Entrez can effic ...


The PeptideAtlas Project provides a publicly-accessible database of peptides identified in tandem mass spectrometry proteomics studies and software tools. Mass spectrometer output files are collected for human, mouse, yeast, and several other organis ...

The Arabidopsis Information Resource

The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana. Data available from TAIR includes the complete genome sequence along with gene structure, gene pro ...


NeuroMorpho.Org is a centrally curated inventory of 3D digitally reconstructed neurons associated with peer-reviewed publications. The goal of NeuroMorpho.Org is to provide dense coverage of available reconstruction data for the neuroscience communit ...

Virus Particle Explorer

VIPERdb is a database for icosahedral virus capsid structures. The emphasis is on providing data from structural and computational analyses on these systems, as well as high quality renderings for visual exploration.

TDR Targets

TDR Targets integrates chemical and genomic information and allows users to prioritize targets and compounds to develop and repurpose new drugs and chemical tools for human pathogens. The TDR Target Project was started in 2005 after a call for applic ...

IUPAC International Chemical Identifier

Originally developed by the International Union of Pure and Applied Chemistry (IUPAC), the IUPAC International Chemical Identifier (InChI) is a machine-readable string generated from a chemical structure. InChIs are unique to the compound they descri ...
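As a rough illustration of how such a string is organised, the sketch below splits an InChI into its slash-separated layers. It is a simplified reading, not a validator: it assumes the second segment is the chemical formula and that every later layer has a distinct one-letter prefix, which does not hold for every InChI. The function name `split_inchi_layers` is hypothetical.

```python
def split_inchi_layers(inchi: str) -> dict:
    """Split an InChI string into version, formula, and prefixed layers.

    Simplified sketch: assumes segment 2 is the formula and that layer
    prefixes (e.g. 'c' connectivity, 'h' hydrogens) do not repeat.
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:  # skip empty segments rather than fail on them
            layers[part[0]] = part[1:]
    return layers


# Standard InChI for ethanol.
ethanol = split_inchi_layers("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
print(ethanol)
```

For real work, a cheminformatics toolkit or the official InChI software is the safer route; this only shows the layered layout of the string.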

Protein Data Bank Format

An exchange format for reporting experimentally determined three-dimensional structures of biological macromolecules that serves a global community of researchers, educators, and students. The data contained in the archive include atomic coordinates, ...
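The atomic coordinates in this format are stored in fixed-width ATOM and HETATM records. The sketch below reads a handful of fields from one such line using the commonly documented column positions; the `Atom` tuple and `parse_atom_record` are hypothetical names, and the sample line is built with a format string purely so that its columns are aligned by construction.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_record(line: str) -> Atom:
    """Parse one fixed-width ATOM/HETATM line (comments give 1-based columns)."""
    if line[0:6].strip() not in ("ATOM", "HETATM"):
        raise ValueError("not an ATOM or HETATM record")
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


# Illustrative record (made-up coordinates), assembled field by field.
sample = "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   {:8.3f}{:8.3f}{:8.3f}".format(
    "ATOM", 1, " CA ", "", "ALA", "A", 12, "", 11.104, 6.134, -6.504
)
atom = parse_atom_record(sample)
print(atom)
```

Slicing by column rather than splitting on whitespace matters here, because adjacent fields in this layout can touch with no separating space.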

Protein Data Bank Japan

The Protein Data Bank is the single worldwide archive of structural data of biological macromolecules.

NMR Self-defining Text Archive and Retrieval format

NMR-STAR is an extension of the STAR file format to store the results of biological NMR experiments.

Nuclear Magnetic Resonance Controlled Vocabulary

nmrCV is an MSI-sanctioned NMR controlled vocabulary, created within the COSMOS EU project, to support the nmrML data standard for nuclear magnetic resonance data in metabolomics with standardized meaningful data descriptors. This CV is the successor ...


MobiDB is a database of intrinsically disordered regions (IDRs) and related features from various sources and prediction tools. Different levels of reliability and different features are reported as different and independent annotations. The database ...

Crystallography Open Database

The Crystallography Open Database (COD) is a project that aims to gather all available inorganic, metal-organic and small organic molecule structural data in one database.

Protein Circular Dichroism Data Bank

The Protein Circular Dichroism Data Bank (PCDDB) is an open-access online repository for protein circular dichroism spectral- and meta-data. Users may freely extract and deposit validated data and the validation process is conveniently integrated int ...

Model Archive

The Model Archive provides a stable archive for computational macro-molecular models published in the scientific literature. The model archive provides a unique stable accession code (DOI) for each deposited model, which can be directly referenced in ...


The international glycan structure repository for glycans published in the literature. Any glycan structure, ranging in resolution from monosaccharide composition to fully defined structures, can be registered and have an accession number assigned as ...

Structural Biology Data Grid

The Structural Biology Data Grid (SBGrid-DG) is a community-driven repository to preserve primary experimental datasets that support scientific publications. The SBGrid Data Bank is an open source research data management system enabling Structural Biolog ...

ImMunoGeneTics Ontology

IMGT-ONTOLOGY provides a specific vocabulary of the terminology to be used in immunogenetics and immunoinformatics. The ontology allows for the standardization for immunogenetics data from genome, proteome, genetics, two-dimensional and three-dimensi ...

CHEMical INFormation Ontology

The Chemical Information Ontology (CHEMINF) aims to establish a standard in representing chemical information. In particular, it aims to produce an ontology to represent chemical structure and to richly describe chemical properties, whether intrinsic ...

Bacterial protein tYrosine Kinase database

The Bacterial protein tYrosine Kinase database (BYKdb) contains computer-annotated BY-kinase sequences. The database web interface allows static and dynamic queries and provides integrated analysis tools including sequence annotation.

macromolecular Crystallographic Information File

PDBx/mmCIF is a data dictionary for archiving macromolecular crystallographic experiments and their results.

RNA Ontology

RNAO is a controlled vocabulary pertaining to RNA function and based on RNA sequences, secondary and three-dimensional structures. The central aim of the RNA Ontology Consortium (ROC) is to develop an ontology to capture all aspects of RNA - from pri ...

Simplified Molecular Input Line Entry Specification Format

This format is an open specification version of the SMILES language, a typographical line notation for specifying chemical structure. It is hosted under the banner of the Blue Obelisk project, with the intent to solicit contributions and comments fro ...

Neuroimaging Informatics Tools and Resources Collaboratory Resources Registry

Neuroimaging Informatics Tools and Resources Collaboratory Resources Registry (NITRC-R) describes software tools and resources, vocabularies, test data, and databases. It is intended to extend the impact and longevity of previously funded neuroimagin ...

Carbohydrate Structure Database

The Carbohydrate Structure Database (CSDB) contains manually curated natural carbohydrate structures, taxonomy, bibliography, NMR data and more. The Bacterial (BCSDB) and Plant&Fungal (PFCSDB) databases were merged in 2015, becoming the CSDB, to impr ...

Coherent X-ray Imaging Data Bank

The Coherent X-ray Imaging Data Bank (CXIDB) offers scientists from all over the world a unique opportunity to access data from Coherent X-ray Imaging (CXI) experiments. It arose from the need to share the terabytes of data generated from X-ray free- ...


Genome3D is a resource that provides structural annotation and 3D models of genomes of model organisms such as human, yeast and E. coli. The database can be used to predict protein structures that have not yet been identified. Genome3D uses structural ...

Chemical Markup Language

CML (Chemical Markup Language) is an XML language designed to hold most of the central concepts in chemistry. It was the first language to be developed and plays the same role for chemistry as MathML for mathematics and GML for geographical systems. ...


ChannelsDB is a comprehensive and regularly updated resource of channels, pores and tunnels found in biomacromolecules deposited in the Protein Data Bank.

Dot Bracket Notation (DBN) - Vienna Format

The bracket notation for RNA secondary structures. Pseudo-knot free secondary structures can be represented in the space-efficient bracket notation, which is used throughout the Vienna RNA package.
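A minimal sketch of how the bracket notation can be decoded, assuming the usual convention that `(` and `)` mark the two partners of a base pair and `.` marks an unpaired position. The function name `dot_bracket_to_pairs` is hypothetical and is not part of the Vienna RNA package.

```python
def dot_bracket_to_pairs(structure: str) -> list:
    """Return sorted 0-based (i, j) base pairs from a pseudo-knot free string."""
    stack = []   # positions of '(' still waiting for a partner
    pairs = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


pairs = dot_bracket_to_pairs("((..))..")
print(pairs)  # [(0, 5), (1, 4)]
```

A single stack is enough precisely because the structures are pseudo-knot free: every closing bracket pairs with the most recent unmatched opening bracket.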

MDL molfile Format

An MDL Molfile is a file format for holding information about the atoms, bonds, connectivity and coordinates of a molecule. Each molfile describes a single molecular structure which can contain disjoint fragments. The V3000 molfile and V3000 rxnfile ...
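To make the atom, bond and coordinate layout concrete, here is a minimal sketch that reads the older V2000 flavour of the format (not V3000): three header lines, a counts line, then fixed-width atom and bond lines. The function name `parse_v2000_molfile` and the two-atom sample are illustrative assumptions, and the sample is assembled with format strings so its columns line up by construction.

```python
def parse_v2000_molfile(text: str):
    """Return (atoms, bonds) from a V2000 molfile; a sketch, not a full parser."""
    lines = text.splitlines()
    counts = lines[3]                      # line 4 is the counts line
    n_atoms = int(counts[0:3])
    n_bonds = int(counts[3:6])
    atoms = []
    for line in lines[4:4 + n_atoms]:
        # x, y, z occupy 10 characters each; the symbol follows one space
        atoms.append((line[31:34].strip(),
                      float(line[0:10]), float(line[10:20]), float(line[20:30])))
    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        # first atom index, second atom index, bond type (1-based indices)
        bonds.append((int(line[0:3]), int(line[3:6]), int(line[6:9])))
    return atoms, bonds


# Hypothetical two-atom example (C-O single bond, made-up coordinates).
sample = "\n".join([
    "example", "  illustrative-header", "",
    "{:3d}{:3d}".format(2, 1) + "  0  0  0  0  0  0  0  0999 V2000",
    "{:10.4f}{:10.4f}{:10.4f} {:<3}".format(0.0, 0.0, 0.0, "C"),
    "{:10.4f}{:10.4f}{:10.4f} {:<3}".format(1.43, 0.0, 0.0, "O"),
    "{:3d}{:3d}{:3d}".format(1, 2, 1),
    "M  END",
])
atoms, bonds = parse_v2000_molfile(sample)
print(atoms, bonds)
```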


WALTZ-DB 2.0 is a database for characterizing short peptides for their amyloid fiber-forming capacities. The majority of the data comes from electron microscopy, FTIR and Thioflavin-T experiments done by the Switch lab. Apart from that class of data ...


GlycoRDF is a standard representation for storing Glycomics data (glycan structures, publication information, biological source information, experimental data) in RDF. The RDF language is defined by an OWL ontology and documented in the ontology and ...


T-psi-C is a database of tRNA sequences and 3D tRNA structures. The T-psi-C database can be continuously updated by any member of the scientific community.

Hierarchical Editing Language for Macromolecules

HELM (Hierarchical Editing Language for Macromolecules) enables the representation of a wide range of biomolecules (e.g. proteins, nucleotides, antibody drug conjugates) whose size and complexity render existing small-molecule and sequence-based info ...

Chemical Abstracts Service Registry

CAS REGISTRY is the most authoritative collection of disclosed chemical substance information. It covers substances identified from the scientific literature from 1957 to the present, with additional substances going back to the early 1900s. CAS REGI ...

Enzyme Structure Function Ontology

The ESFO provides a new paradigm for organizing enzyme sequence, structure, and function information, whereby specific elements of enzyme sequence and structure are mapped to specific conserved aspects of function, thus facilitating the functional an ...

Web3 Unique Representation of Carbohydrate Structures

The Web3 Unique Representation of Carbohydrate Structures (WURCS) defines a generalizable and unique linear representation for carbohydrate structures. A recent update (WURCS 2.0) was created to handle structural ambiguity around (potential) carbonyl ...


NucMap is a database of genome-wide nucleosome positioning across multiple species. Based on raw sequence data from published studies, NucMap integrates, analyzes, and visualizes nucleosome positioning data across species.

EcoliWiki: A Wiki-based community resource for Escherichia coli

EcoliWiki is a community-based resource for the annotation of all non-pathogenic E. coli, its phages, plasmids, and mobile genetic elements.

iPPI-DB - Inhibitors of Protein-Protein Interaction Database

iPPI-DB is a database of modulators of protein-protein interactions. It contains exclusively small molecules and therefore no peptides. The data are retrieved from the literature, either peer-reviewed scientific articles or world patents. A large vari ...

Collaborative Computing Project for NMR

The CCPN Data Model for macromolecular NMR is intended to cover all data needed for macromolecular NMR spectroscopy from the initial experimental data to the final validation. It serves for exchange of data between programs, for storage, data harvest ...

Structure Data Format

Structure Data Format (SDF) is a chemical file format for representing multiple chemical structure records and associated data fields. SDF was developed and published by Molecular Design Limited (MDL) and became the most widely used standard for importi ...
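The idea of multiple structure records with associated data fields can be sketched as follows, assuming the usual layout in which each record is a molfile block followed by `> <FIELD>` data items and closed by a `$$$$` line. The function name `split_sdf` and the sample record (with an empty placeholder structure) are hypothetical, and edge cases such as multi-line headers are not handled.

```python
def split_sdf(text: str) -> list:
    """Split SDF text into records of {'molblock': str, 'fields': dict}."""
    records = []
    mol_lines, fields, current, in_data = [], {}, None, False
    for line in text.splitlines():
        if line.strip() == "$$$$":            # record terminator
            records.append({
                "molblock": "\n".join(mol_lines),
                "fields": {k: "\n".join(v) for k, v in fields.items()},
            })
            mol_lines, fields, current, in_data = [], {}, None, False
        elif line.startswith(">"):            # data header, e.g. "> <NAME>"
            in_data = True
            start, end = line.find("<"), line.rfind(">")
            current = line[start + 1:end] if 0 <= start < end else None
            if current is not None:
                fields[current] = []
        elif in_data:
            if not line.strip():              # blank line ends the data item
                current = None
            elif current is not None:
                fields[current].append(line)
        else:
            mol_lines.append(line)            # still inside the molfile block
    return records


sample = "\n".join([
    "ethanol", "", "",
    "  0  0  0  0  0  0  0  0  0  0999 V2000",  # placeholder: no atoms listed
    "M  END",
    "> <NAME>", "ethanol", "",
    "> <FORMULA>", "C2H6O", "",
    "$$$$",
])
records = split_sdf(sample)
print(records[0]["fields"])
```

Any text after the last `$$$$` line is ignored by this sketch.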

PROtein-protein compleX MutAtion ThErmodynamics Database

PROXiMATE is a database of thermodynamic data for more than 6000 missense mutations in 174 heterodimeric protein-protein complexes, supplemented with interaction network data from the STRING database, solvent accessibility, sequence, structural and funct ...

Japan Chemical Substance Dictionary

The Japan Chemical Substance Dictionary is an organic compound dictionary database prepared by the Japan Science and Technology Agency (JST).

ChemDraw Native File Format

CDX is the native file format of ChemDraw, and is guaranteed to save anything drawn in ChemDraw without loss of data. At the same time, however, its architecture was carefully designed to make it a flexible and general-purpose chemical format. It is ...

MolMeDB: Molecules on Membranes Database

MolMeDB is an open chemistry database concerning the interaction of molecules with membranes.

Imperial College Research Data Repository

A lightweight digital repository for data, based on the concept of collections of filesets. Both the collection and the fileset are assigned a DOI by the DataCite organisation, which can be quoted in articles.

Ontology for Genetic Interval

Using BFO (Basic Formal Ontology) as its upper-level ontology, the Ontology for Genetic Interval (OGI) represents a gene as an entity with its 3D shape, topography, and primary DNA sequence as the foundation for its 3D structure. There is no official h ...

ICM binary file Format

ICM binary file Format is used in databases pertaining to structural biology and protein families. This format can be used for the graphical representation of RNA, DNA, and protein interactions.

mini Protein Data Bank Format

"mini Protein Data Bank Format" is a standard.

NIH 3D Print Exchange

Few scientific 3D-printable models are available online, and the expertise required to generate and validate such models remains a barrier. The NIH 3D Print Exchange eliminates this gap with an open, comprehensive, and interactive website for searchi ...

Reciprocal Net

The Reciprocal Net is a distributed database used by research crystallographers to store information about molecular structures; much of the data is available to the general public. The Reciprocal Net project is still under development. Currently, th ...

The CXI File Format for Coherent X-ray Imaging

The CXI file format was created as a common format for all the data in the Coherent X-ray Imaging Data Bank (CXIDB). Naturally, its scope is all experimental data collected during Coherent X-ray Imaging experiments as well as all data generated during th ...

