Source | Match | ReputationScore* |
---|---|---|
UCSC Genome Browser database
Genome assemblies and aligned annotations for a wide range of vertebrates and model organisms, along with an integrated tool set for visualizing, comparing, analyzing and sharing both publicly available and user-generated genomic datasets.
|
|
|
RCSB Protein Data Bank
This resource is powered by the Protein Data Bank archive: information about the 3D shapes of proteins, nucleic acids, and complex assemblies that helps students and researchers understand all aspects of biomedicine and agriculture, from protein synth
...
|
|
|
The International Genome Sample Resource
The International Genome Sample Resource (IGSR) was established to ensure the ongoing usability of data generated by the 1000 Genomes Project and to extend the data set. The 1000 Genomes Project ran between 2008 and 2015, creating the largest public
...
|
|
|
CLUSTAL-W Alignment Format
CLUSTAL-W Alignment Format is a simple text-based format, often with a *.aln file extension, used for the input and output of DNA or protein sequences into the Clustal suite of multiple alignment programs.
|
|
|
GenBank
GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. The complete release notes for the current version of GenBank are available on the NCBI ftp site. A new release is made every two months. G
...
|
|
|
Sequence Read Archive
The Sequence Read Archive (SRA) stores raw sequencing data from next-generation sequencing platforms. Data submitted to SRA is organized using a metadata model consisting of six objects: study, sample, experiment, run, analysis and submissi
...
|
|
|
Minimum Information for Publication of Quantitative Real-Time PCR Experiments
The aim of MIQE is to provide authors, reviewers and editors with specifications for the minimum information that must be reported for a qPCR experiment in order to ensure its relevance, accuracy, correct interpretation and repeatability.
|
|
|
Reference Sequence Database
The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins.
|
|
|
ARLEQUIN Project Format
Arlequin ver 3.0 is a software package integrating several basic and advanced methods for population genetics data analysis, like the computation of standard genetic diversity indices, the estimation of allele and haplotype frequencies, tests of depa
...
|
|
|
Worldwide Protein Data Bank
The Protein Data Bank (PDB) is an archive of experimentally determined three-dimensional structures of biological macromolecules that serves a global community of researchers, educators, and students. The data contained in the archive include atomic
...
|
|
|
Minimum Information About a Microarray Experiment
MIAME is intended to specify all the information necessary for an unambiguous interpretation of a microarray experiment, and potentially to reproduce it. MIAME defines the content but not the format for this information.
|
|
|
Greengenes
A 16S rRNA gene database which provides chimera screening, standard alignment, and taxonomic classification using multiple published taxonomies.
|
|
|
Database resources of the National Center for Biotechnology Information
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts publish
...
|
|
|
FASTA Sequence Format
FASTA format is a text-based format for representing either nucleotide sequences or peptide sequences, in which nucleotides or amino acids are represented using single-letter codes. The format also allows for sequence names and comments to precede th
...
|
|
|
European Nucleotide Archive
The European Nucleotide Archive (ENA) is a globally comprehensive data resource for nucleotide sequence, spanning raw data, alignments and assemblies, functional and taxonomic annotation and rich contextual data relating to sequenced samples and expe
...
|
|
|
BLAST-like Alignment Tool Format
BLAT is an alignment algorithm developed for the analysis and comparison of biological sequences such as DNA, RNA and proteins.
|
|
|
The Cancer Genome Atlas
The Cancer Genome Atlas (TCGA) is a comprehensive, collaborative effort led by the National Institutes of Health (NIH) to map the genomic changes associated with specific types of tumors to improve the prevention, diagnosis and treatment of cancer. I
...
|
|
|
Pseudomonas Genome DB
The Pseudomonas Genome Database is a resource for peer-reviewed, continually updated annotation for all Pseudomonas species. It includes gene and protein sequence information, as well as regulation and predicted function and annotation.
|
|
|
BioModels
BioModels is a repository of computational models of biological processes. It allows users to search and retrieve mathematical models published in the literature. Many models are manually curated (to ensure reproducibility) and extensively cross-link
...
|
|
|
NCBI Gene
The Entrez Global Query Cross-Database Search System is a federated search engine, or web portal that allows users to search many discrete health sciences databases at the National Center for Biotechnology Information (NCBI) website. Entrez can effic
...
|
|
|
Protein Data Bank in Europe
The Protein Data Bank in Europe (PDBe) is the European resource for the collection, organisation and dissemination of data on biological macromolecular structures. It is a founding member of the worldwide Protein Data Bank which collects, organises a
...
|
|
|
PeptideAtlas
The PeptideAtlas Project provides a publicly-accessible database of peptides identified in tandem mass spectrometry proteomics studies and software tools. Mass spectrometer output files are collected for human, mouse, yeast, and several other organis
...
|
|
|
Sequence Ontology
SO is a collaborative ontology project for the definition of sequence features used in biological sequence annotation. The Sequence Ontology is a set of terms and relationships used to describe the features and attributes of biological sequence. SO i
...
|
|
|
Ensembl Genomes
The Ensembl genome annotation system, developed jointly by EMBL-EBI and the Wellcome Trust Sanger Institute, has been used for the annotation, analysis and display of vertebrate genomes since 2000. Since 2009, the Ensembl site has been complemented b
...
|
|
|
VISTA
Comprehensive suite of programs and databases for comparative analysis of genomic sequences. There are two ways of using VISTA - you can submit your own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignm
...
|
|
|
NONCODE
NONCODE is a database of noncoding RNAs (except tRNAs and rRNAs), including long noncoding (lnc) RNAs. Information contained within the database includes human lncRNA–disease relationships and single nucleotide polymorphism-lncRNA–disease relationshi
...
|
|
|
Pathway Commons
Pathway Commons is a convenient point of access to biological pathway information collected from public pathway databases. Information is sourced from public pathway databases and is readily searched, visualized, and downloaded. The data is freely av
...
|
|
|
Rat Genome Database
The Rat Genome Database stores genetic, genomic, phenotype, and disease data generated from rat research. It provides access to corresponding data for eight other species, allowing cross-species comparison. Data curation is performed both manually an
...
|
|
|
Variant Effect Predictor data format
A text format devised by Ensembl for the eponymous Variant Effect Predictor tool.
|
|
|
ENCODE Project
The ENCODE (Encyclopedia of DNA Elements) Consortium is an international collaboration of research groups funded by the National Human Genome Research Institute (NHGRI). The goal of ENCODE is to build a comprehensive parts list of functional elements
...
|
|
|
FASTQ Sequence and Sequence Quality Format
FASTQ is a text-based file format for sharing sequencing data combining both the sequence and an associated per base quality score.
|
|
|
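As an illustration of the FASTQ description above, here is a minimal sketch of reading FASTQ records. It assumes the common four-line record layout (`@` header, sequence, `+` separator, quality string) and Phred+33 quality encoding; neither is stated in the entry itself, and `parse_fastq` is a hypothetical helper name, not part of any library.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (read_id, sequence, per-base quality scores) from FASTQ text lines.

    Assumption: four lines per record and Phred+33 encoded qualities.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it, "").rstrip("\n")
        plus = next(it, "").rstrip("\n")
        qual = next(it, "").rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        # Phred+33: quality score = ASCII code of the character minus 33
        yield header[1:], seq, [ord(c) - 33 for c in qual]


records = list(parse_fastq(["@read1\n", "ACGT\n", "+\n", "II5!\n"]))
```

Each base in the sequence is paired with exactly one quality character, which is why the sketch rejects records whose two strings differ in length.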
Database of genomic structural VARiation
dbVar is a database of human genomic structural variation where users can search, view, and download data from submitted studies. dbVar stopped supporting data from non-human organisms in 2017; however, existing non-human data remains available. In ke
...
|
|
|
Ensembl Protists
Ensembl Protists stores protist genomes of interest, covering those involved in disease and of scientific interest. This includes genomes such as Plasmodium falciparum, Dictyostelium discoideum, Phytophthora infestans and Leishmania major. A majority
...
|
|
|
Ensembl Metazoa
Ensembl Metazoa provides access to genomes of metazoans of interest in disease, environmental sciences, agriculture and economic concern. Extensive coverage exists of diptera, nematodes, lepidoptera and hymenoptera.
|
|
|
Nucleic Acids Database
The Nucleic Acids Database contains information about experimentally-determined nucleic acids and complex assemblies. NDB can be used to perform searches based on annotations relating to sequence, structure and function, and to download, analyze, and
...
|
|
|
Restriction enzymes and methylases database
A collection of information about restriction enzymes and related proteins. It contains published and unpublished references, recognition and cleavage sites, isoschizomers, commercial availability, methylation sensitivity, crystal, genome, and sequen
...
|
|
|
The Autism Chromosome Rearrangement Database
The Autism Chromosome Rearrangement Database is a collection of hand curated breakpoints and other genomic features, related to autism, taken from publicly available literature, databases and unpublished data.
|
|
|
The Protein Database
The Entrez Protein search and retrieval system contains protein entries that have been compiled from a variety of sources, including SwissProt, PIR, PRF, PDB, and translations from annotated coding regions in GenBank and RefSeq.
|
|
|
Expressed Sequence Tags database
The dbEST contains sequence data and other information on "single-pass" cDNA sequences, or "Expressed Sequence Tags", from a number of organisms. NCBI is in the process of merging EST and GSS records into the Nucleotide database, and the process is e
...
|
|
|
Minimum Information about a MARKer gene Sequence
MIMARKS is the metadata reporting standard of the Genomic Standards Consortium that covers marker gene sequences from environmental surveys or individual organisms.
|
|
|
The Barcode of Life Data Systems
The Barcode of Life Data Systems (BOLD) is an online workbench that aids collection, management, analysis, and use of DNA barcodes. It consists of 3 components (MAS, IDS, and ECS) that each address the needs of various groups in the barcoding communi
...
|
|
|
Minimum Information about a Molecular Interaction Experiment
MIMIx is a community guideline advising the user on how to fully describe a molecular interaction experiment and which information it is important to capture. The document is designed as a compromise between the necessary depth of information to desc
...
|
|
|
Maize Genetics and Genomics Database
MaizeGDB is the maize research community's central repository for genetics and genomics information.
|
|
|
Homologene
HomoloGene is an automated system for the detection of homologs among the annotated genes of several completely sequenced eukaryotic genomes. HomoloGene takes protein sequences from differing species and compares them to one another (using blastp) an
...
|
|
|
SoyBase
SoyBase, the USDA-ARS soybean genetic database, is a comprehensive repository for professionally curated genetics, genomics and related data resources for soybean. SoyBase contains genetic, physical and genomic sequence maps integrated with qualitati
...
|
|
|
Allele Frequency Net Database
The Allele Frequency Net Database (AFND) provides the scientific community with a freely available repository for the storage of frequency data (alleles, genes, haplotypes and genotypes) related to human leukocyte antigens (HLA), killer-cell immunogl
...
|
|
|
Ensembl Fungi
Ensembl Fungi is a browser for fungal genomes. A majority of these are taken from the databases of the International Nucleotide Sequence Database Collaboration (the European Nucleotide Archive at the EBI, GenBank at the NCBI, and the DNA Database of
...
|
|
|
Ensembl Bacteria
Ensembl Bacteria is a browser for bacterial and archaeal genomes. These are taken from the databases of the International Nucleotide Sequence Database Collaboration (the European Nucleotide Archive at the EBI, GenBank at the NCBI, and the DNA Databas
...
|
|
|
Group II introns database
Database for identification and cataloguing of group II introns. All bacterial introns listed are full-length and appear to be functional, based on intron RNA and IEP characteristics. The database names the full-length introns, and provides informati
...
|
|
|
Leiden Open Variation Database
The Leiden Open Variation Database (LOVD) provides a flexible, freely available tool for gene-centered collection and display of DNA variations. LOVD also stores patient-centered data, NGS data, and variants outside of genes.
|
|
|
Yeast Searching for Transcriptional Regulators and Consensus Tracking
YEASTRACT (Yeast Search for Transcriptional Regulators And Consensus Tracking) is a curated repository of more than 48333 regulatory associations between transcription factors (TF) and target genes in Saccharomyces cerevisiae, based on more than 1200
...
|
|
|
PLEXdb
PLEXdb (Plant Expression Database) is a unified gene expression resource for plants and plant pathogens. PLEXdb is a genotype to phenotype, hypothesis building information warehouse, leveraging highly parallel expression data with seamless portals to
...
|
|
|
DNA Data Bank of Japan
An annotated collection of all publicly available nucleotide and protein sequences. DDBJ collects sequence data mainly from Japanese researchers, as well as researchers in other countries. DDBJ is part of the International Nucleotide Sequence Databas
...
|
|
|
Universal PBM Resource for Oligonucleotide Binding Evaluation
The UniPROBE (Universal PBM Resource for Oligonucleotide Binding Evaluation) database hosts data generated by universal protein binding microarray (PBM) technology on the in vitro DNA binding specificities of proteins.
|
|
|
Minimum Information about any (x) Sequence
The minimum information about any (x) sequence (MIxS) is an overarching framework of sequence metadata, that includes technology-specific checklists from the previous MIGS and MIMS standards, provides a way of introducing additional checklists such a
...
|
|
|
Database of Orthologous Groups
OrthoDB presents a catalog of eukaryotic orthologous protein-coding genes. Orthology refers to the last common ancestor of the species under consideration, and thus OrthoDB explicitly delineates orthologs at each radiation along the species phylogeny
...
|
|
|
Dfam
The Dfam database is an open collection of DNA Transposable Element sequence alignments, hidden Markov Models (HMMs), consensus sequences, and genome annotations. Dfam represents a collection of multiple sequence alignments, each containing a set of r
...
|
|
|
RegulonDB
RegulonDB is a model of the complex regulation of transcription initiation or regulatory network of the cell. On the other hand, it is also a model of the organization of the genes in transcription units, operons and simple and complex regulons. In t
...
|
|
|
MycoCosm
MycoCosm provides data access, visualization, and analysis tools for comparative genomics of fungi. MycoCosm enables users to navigate across sequenced fungal genomes, and to conduct comparative and genome-centric analyses of fungi and community anno
...
|
|
|
Genome Sequence Archive
GSA is a data repository specialized for archiving raw sequence reads. It supports data generated from a variety of sequencing platforms ranging from Sanger sequencing machines to single-cell sequencing machines and provides data storing and sharing
...
|
|
|
Ribosomal Database Project (RDP-II)
The Ribosomal Database Project - II (RDP-II) provides data, tools and services related to ribosomal RNA sequences to the research community. Through its website (http://rdp.cme.msu.edu), RDP-II offers aligned and annotated rRNA sequence data, anal
...
|
|
|
Molecular Modeling Database
The Molecular Modeling Database (MMDB), as part of the Entrez system, facilitates access to structure data by connecting them with associated literature, protein and nucleic acid sequences, chemicals, biomolecular interactions, and more.
|
|
|
Variant Call Format
Variant Call Format (VCF) is a text file format, usually stored in compressed form. It contains meta-information lines, a header line, and then data lines each containing information about a position in the genome.
|
|
|
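The three kinds of lines named in the VCF description above can be separated with a short sketch. It assumes the usual conventions that meta-information lines start with `##`, the header line starts with a single `#`, and fields are tab-separated; `parse_vcf` is a hypothetical helper, and the sample record is illustrative only.

```python
from typing import Dict, Iterable, List, Tuple


def parse_vcf(lines: Iterable[str]) -> Tuple[List[str], List[str], List[Dict[str, str]]]:
    """Split uncompressed VCF text into meta lines, header columns and data records.

    Assumption: '##' marks meta-information, '#' marks the header line,
    and all fields are tab-separated.
    """
    meta: List[str] = []
    columns: List[str] = []
    records: List[Dict[str, str]] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("##"):
            meta.append(line[2:])                # e.g. "fileformat=VCFv4.2"
        elif line.startswith("#"):
            columns = line[1:].split("\t")       # e.g. CHROM, POS, ID, ...
        else:
            # One data line describes one position in the genome
            records.append(dict(zip(columns, line.split("\t"))))
    return meta, columns, records


example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "20\t14370\trs6054257\tG\tA\t29\tPASS\tDP=14\n",
]
meta, columns, records = parse_vcf(example)
```

For compressed files, the same function could be fed lines from `gzip.open(path, "rt")`, since it only needs an iterable of text lines.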
Influenza Virus Resource
Influenza Virus Resource presents data obtained from the NIAID Influenza Genome Sequencing Project as well as from GenBank, combined with tools for flu sequence analysis, annotation and submission to GenBank. In addition, it provides links to other r
...
|
|
|
NCBI Viral Genomes Resource
NCBI Viral Genomes Resource is a collection of virus genomic sequences that provides curated sequence data, related information and tools. It includes all complete viral genome sequences deposited in the International Nucleotide Sequence Database Col
...
|
|
|
Molecular database for the identification of fungi
UNITE is primarily a fungal rDNA internal transcribed spacer (ITS) sequence database, although they also welcome additional genes and genetic markers. UNITE focuses on high-quality ITS sequences generated from fruiting bodies collected and identified
...
|
|
|
BBMRI-ERIC Directory
The BBMRI-ERIC Directory is a tool to share aggregate information about biobanks across Europe. The Directory welcomes new biobanks to join and publish information about themselves, including their contact information in the directory. The BBMRI-ERIC
...
|
|
|
Ensembl Plants
Ensembl Plants holds the genomes of plants of significant interest. These range from species of agricultural importance to those that support primary research or are of environmental interest. Ensembl Plants datasets are constructed in a direct collaborati
...
|
|
|
e-Mouse Atlas of Gene Expression
The e-Mouse Atlas of Gene Expression (EMAGE) is a freely available database of in situ gene expression patterns that allows users to perform online queries of mouse developmental gene expression. EMAGE is unique in providing both text-based descripti
...
|
|
|
Addgene
Addgene is a non-profit plasmid repository dedicated to helping scientists around the world share high-quality plasmids. Addgene are working with thousands of laboratories to assemble a high-quality library of published plasmids for use in research a
...
|
|
|
TAIR annotation data Format
At TAIR, we display Gene Ontology and Plant Ontology annotations made by TAIR curators and those made by the community including individual researchers and contributors to the GO Consortium. The GO annotations in TAIR are made using a combination of
...
|
|
|
Rice Genome Annotation Project
This website provides genome sequence from the Nipponbare subspecies of rice and annotation of the 12 rice chromosomes. These data are available through search pages and the Genome Browser that provides an integrated display of annotation data.
|
|
|
dictyBase
dictyBase is a single-access database for the complete genome sequence and expression data of four Dictyostelid species providing information on research, genome and annotations. There is also a repository of plasmids and strains held at the Dicty St
...
|
|
|
UniVec
UniVec is a database that can be used to quickly identify segments within nucleic acid sequences which may be of vector origin (vector contamination). In addition to vector sequences, UniVec also contains sequences for those adapters, linkers, and pr
...
|
|
|
BioCyc
The BioCyc collection of Pathway/Genome Databases (PGDBs) provides electronic reference sources on the pathways and genomes of hundreds of organisms with completely sequenced genomes. Each database contains the genome, predicted metabolic pathways, p
...
|
|
|
ProtClustDB
ProtClustDB is a collection of related protein sequences (clusters) consisting of Reference Sequence proteins encoded by complete genomes. This database contains both curated and non-curated clusters.
|
|
|
Global Genome Biodiversity Network Data Standard
The GGBN Data Standard is a set of terms and controlled vocabularies designed to represent sample facts. It does not cover e.g., scientific name, geography, or physiological facts. This allows combining the GGBN Data Standard with other complementary
...
|
|
|
AceView Worm Genome
AceView provides a curated, comprehensive and non-redundant sequence representation of all public mRNA sequences (mRNAs from GenBank or RefSeq, and single pass cDNA sequences from dbEST and Trace). These experimental cDNA sequences are first co-align
...
|
|
|
Minimum Information About a Microarray Experiment involving Plants
MIAME/Plant is a standard describing which biological details should be captured for describing microarray experiments involving plants. Detailed information is required about biological aspects such as growth conditions, harvesting time or harvested
...
|
|
|
BAliBASE
BAliBASE is a benchmark alignment database, including enhancements for repeats, transmembrane sequences and circular permutations.
|
|
|
Fungal and Oomycete genomics resource
FungiDB is an integrated genomic and functional genomic database for the kingdom Fungi. The database integrates whole genome sequence and annotation and also includes experimental and environmental isolate sequence data. The database includes compara
...
|
|
|
Genetic Codes
NCBI takes great care to ensure that the translation for each coding sequence (CDS) present in GenBank records is correct. Central to this effort is careful checking on the taxonomy of each record and assignment of the correct genetic code for each o
...
|
|
|
Human Endogenous Retrovirus database
This database is compiled from the human genome nucleotide sequences obtained mostly in the Human Genome Projects. The database makes it possible to continuously improve classification and characterization of retroviral families. The HERV database no
...
|
|
|
Infevers
A registry of Hereditary Auto-inflammatory Disorder Mutations.
|
|
|
PROkariotIC Database Of Gene-Regulation
PRODORIC is a comprehensive database about gene regulation and gene expression in prokaryotes. It includes a manually curated and unique collection of transcription factor binding sites.
|
|
|
MethylomeDB
This resource details DNA methylation profiles in human and mouse brain. This database includes genome-wide DNA methylation profiles for human and mouse brains. The DNA methylation profiles were generated by Methylation Mapping Analysis by Paired-end
...
|
|
|
STrengthening the REporting of Genetic Association Studies
The STrengthening the REporting of Genetic Association studies (STREGA) initiative builds on the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement and provides additions to 12 of the 22 items on the STROBE checkl
...
|
|
|
A Systematic Annotation Package
ASAP is a relational database and web interface developed to store, update and distribute genome sequence data and gene expression data. It was designed to facilitate ongoing community annotation of genomes and to grow with genome projects as they mo
...
|
|
|
MINAS - A Database of Metal Ions in Nucleic AcidS
MINAS contains the exact geometric information on the first and second-shell coordinating ligands of every metal ion present in nucleic acid structures that are deposited in the PDB and NDB. Containing also the sequence information of the binding poc
...
|
|
|
Database of Genomic Variants archive (DGVa)
The DGVa team accepts direct submissions from researchers and also curates data from the published literature. As part of a regular exchange, DGVa data is sent to its partner archive, dbVar (hosted by the National Center for Biotechnology Information
...
|
|
|
The Chromosome 7 Annotation Project
The objective of this project is to generate the most comprehensive description of human chromosome 7 to facilitate biological discovery, disease gene research and medical genetic applications.
|
|
|
Integrated Resource for Reproducibility in Macromolecular Crystallography
The Integrated Resource for Reproducibility in Macromolecular Crystallography (IRRMC) was created to make the raw data of protein crystallography more widely available. The IRRMC identifies, catalogs and provides metadata related to datasets that cou
...
|
|
|
Online Mendelian Inheritance in Animals
Online Mendelian Inheritance in Animals is a database of inherited disorders, other (single-locus) traits, and genes in animal species (other than human and mouse).
|
|
|
Integrative and Conjugative Elements in Bacteria
A web-based resource for integrative and conjugative elements (ICEs) found in bacteria. It collates available data from experimental and bioinformatics analyses, and literature, about known and putative ICEs in bacteria as a PostgreSQL-based database
...
|
|
|
DNASU Plasmid Repository
DNASU is a central repository for plasmid clones and collections. Currently we store and distribute over 197,000 plasmids including 75,000 human and mouse plasmids, full genome collections, the protein expression plasmids from the Protein Structure I
...
|
|
|
cis-Regulatory Element Database
The cisRED database holds conserved sequence motifs identified by genome scale motif discovery, similarity, clustering, co-occurrence and coexpression calculations. Sequence inputs include low-coverage genome sequence data and ENCODE data.
|
|
|
Human Mitochondrial Database
HmtDB is an open resource created to support population genetics and mitochondrial disease studies. The database hosts human mitochondrial genome sequences annotated with population and variability data, the latter being estimated through the applica
...
|
|
|
PAZAR
PAZAR is a software framework for the construction and maintenance of regulatory sequence data annotations; a framework which allows multiple boutique databases to function independently within a larger system (or information mall). The goal of PAZAR
...
|
|
|
The DNA Replication Origin Database
This database summarizes our knowledge of replication origins in the budding yeast Saccharomyces cerevisiae. Each proposed origin site has been assigned a Status (Confirmed, Likely, or Dubious) expressing the confidence that the site genuinely corres
...
|
|
|
Signaling Pathway Integrated Knowledge Engine
SPIKE (Signaling Pathway Integrated Knowledge Engine) is an interactive software environment that graphically displays biological signaling networks, allows dynamic layout and navigation through these networks, and enables the superposition of DNA mi
...
|
|
|
Plant DNA C-values database
A database containing genome size (C-value) data for all groups of land plants and red, green and brown algae.
|
|
|
IPD-KIR - Killer-cell Immunoglobulin-like Receptors
The database provides a centralised repository for human KIR sequences. Killer-cell Immunoglobulin-like Receptors (KIR) have been shown to be highly polymorphic at the allelic and haplotypic level. KIRs are members of the immunoglobulin superfamily (
...
|
|
|
Genomic Contextual Data Markup Language
The Genomic Contextual Data Markup Language (GCDML) is a core project of the Genomic Standards Consortium (GSC) that is a reference implementation of the Minimum Information about a Genome Sequence (MIGS/MIMS/MIMARKS), and the extensions the Minimum Inf
...
|
|
|
Minimal Information About a Phylogenetic Analysis
The MIAPA (minimum information about a phylogenetic analysis) checklist details the list of metadata necessary for researchers to evaluate or reuse a published phylogeny.
|
|
|
MicrosporidiaDB
MicrosporidiaDB is one of the databases that can be accessed through the EuPathDB (http://EuPathDB.org; formerly ApiDB) portal, covering eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Tricho
...
|
|
|
CryptoDB
CryptoDB serves as the functional genomics database for Cryptosporidium and related species. CryptoDB is a free, online resource for accessing and exploring genome sequence and annotation, functional genomics data, isolate sequences, and orthology pr
...
|
|
|
Generic Feature Format Version 3
The Generic Feature Format Version 3 (GFF3) format was developed after earlier formats, although widely used, became fragmented into multiple incompatible dialects. The GFF3 format addresses the most common extensions to GFF, while preserving backwar
...
|
|
|
Binary Alignment Map Format
BAM is the compressed binary version of the Sequence Alignment/Map (SAM) format, a compact and indexable representation of nucleotide sequence alignments. Many next-generation sequencing and analysis tools work with SAM/BAM. For custom track display,
...
|
|
|
Bovine Genome Database
The Bovine Genome Database project is developed to support the efforts of bovine genomics researchers by providing data mining, genome navigation and annotation tools for the bovine reference genome based on the Hereford cow, L1 Dominette 01449.
|
|
|
BacMap
BacMap is a picture atlas of annotated bacterial genomes. It is an interactive visual database containing hundreds of fully labeled, zoomable, and searchable maps of bacterial genomes.
|
|
|
Distributed Sequence Annotation System
The Distributed Annotation System (DAS) defines a communication protocol used to exchange annotations on genomic or protein sequences.
|
|
|
Telomerase Database
The Telomerase Database is a Web-based tool for the study of structure, function, and evolution of the telomerase ribonucleoprotein. The objective of this database is to serve the research community by providing a comprehensive compilation of informa
...
|
|
|
MitoFish
Mitochondrial genome database of fish with an accurate and automatic annotation pipeline.
|
|
|
SynBioHub
SynBioHub is a design repository for people designing biological constructs. It enables DNA and protein designs to be uploaded, then provides a shareable link to allow others to view them. SynBioHub also facilitates searching for information about ex
...
|
|
|
Minimal Metagenome Sequence Analysis Standard
A proposed set of minimal standard analyses necessary for proper interpretation of meta-omic data and to allow comparative metagenomics and metatranscriptomics. Please note: We cannot find an up-to-date website for this resource. As such, we have mar
...
|
|
|
Human disease methylation database
The human disease methylation database, DiseaseMeth, is a web-based resource focused on the aberrant methylomes of human diseases. Large-scale data have recently become available in growing volume, from which more information can be min
...
|
|
|
ChimerDB
ChimerDB is a database of fusion sequences encompassing bioinformatics analysis of mRNA and EST sequences in the GenBank, manual collection of literature data and integration with other well known databases. Fusion transcripts with nonoverlapping ali
...
|
|
|
Tandem Repeats Database
Tandem Repeats Database (TRDB) is a public repository of information on tandem repeats in genomic DNA and contains a variety of tools for their analysis.
|
|
|
DNA Methylation Interactive Visualization Database
DNMIVD is a comprehensive annotation and interactive visualization database for DNA methylation profiles of diverse human cancers, constructed with high-throughput microarray data from the TCGA and GEO databases, and it also integrates some data from Pancan
...
|
|
|
Database of Rice Transcription Factors
DRTF contains 2025 putative transcription factors (TFs) in Oryza sativa L. ssp. indica and 2384 in ssp. japonica, distributed in 63 families, identified by computational prediction and manual curation. It includes detailed annotations of each TF incl
...
|
|
|
Homologous Vertebrate Genes Database
HOVERGEN is a database of homologous vertebrate genes that allows one to select sets of homologous genes among vertebrate species, and to visualize multiple alignments and phylogenetic trees.
|
|
|
Plant Genomics and Phenomics Research Data Repository
This repository provides several plant genomic and phenotypic datasets resulting from IPK and German Plant Phenotyping Network (DPPN) research activities. It was established in January 2015. The background of the study is in plant genetic resources,
...
|
|
|
The Database of Human DNA Methylation and Cancer
The database of human DNA Methylation and Cancer (MethyCancer) was developed to study the interplay of DNA methylation, gene expression and cancer. It hosts both highly integrated data of DNA methylation, cancer-related gene, mutation and cancer informati
...
|
|
|
Mouse Atlas of Gene Expression
This repository is no longer available.
|
|
|
4DGenome
4DGenome is a public database that archives and disseminates chromatin interaction data. Currently, 4DGenome contains over 8,038,248 interactions curated from both experimental studies (high throughput and individual studies) and computational predic
...
|
|
|
Beijing Genomics Institute Rice Information System
In BGI-RIS, sequence contigs of Beijing indica and Syngenta japonica have been further assembled and anchored onto the rice chromosomes. The database has annotated the rice genomes for gene content, repetitive elements, and SNPs. Sequence polymorphis
...
|
|
|
American Type Culture Collection database
ATCC authenticates microorganisms and cell lines and manages logistics of long-term preservation and distribution of cultures for the scientific community. ATCC supports the cultures it acquires and authenticates with expert technical support, intell
...
|
|
|
GrainGenes, a Database for Triticeae and Avena
The GrainGenes website hosts a wealth of information for researchers working on Triticeae species, oat and their wild relatives. The website hosts a database encompassing information such as genetic maps, genes, alleles, genetic markers, phenotypic d
...
|
|
|
Polbase
Polbase is an open and searchable database providing information from published and unpublished sources on the biochemical, genetic, and structural information of DNA polymerases.
|
|
|
The mitochondrial DNA breakpoints database
A comprehensive on-line resource with curated datasets of mitochondrial DNA (mtDNA) rearrangements.
|
|
|
Minimal Information about a high throughput SEQuencing Experiment
MINSEQE describes the Minimum Information about a high-throughput nucleotide SEQuencing Experiment that is needed to enable the unambiguous interpretation and facilitate reproduction of the results of the experiment. By analogy to the MIAME guideline
...
|
|
|
NIDA Center on Genetics Studies
This resource stores and distributes clinical data and biomaterials (DNA samples and cell lines) available in the NIDA Genetics Initiative. This includes blood and other biospecimens along with phenotypic data.
|
|
|
WGE
A CRISPR database for genome engineering.
|
|
|
Variation Ontology
Variation Ontology, VariO, is an ontology for standardized, systematic description of effects, consequences and mechanisms of variations. VariO allows unambiguous description of variation effects as well as computerized analyses over databases utiliz
...
|
|
|
BCCM/MUCL Agro-food & Environmental Fungal Collection
BCCM/MUCL is a generalist fungal culture collection of over 30 000 filamentous fungi, yeasts and arbuscular mycorrhizal fungi including type, reference and test strains. The collections activities include the distribution of its holdings, the accessi
...
|
|
|
Human Disease-Related Viral Integration Sites
Dr.VIS collects and locates human disease-related viral integration sites. So far, about 600 sites covering 5 virus organisms and 11 human diseases are available. Integration sites in Dr.VIS are located against chromosome, cytoband, gene and refseq p
...
|
|
|
Real-time PCR Data Markup Language
The RDML file format is developed by the RDML consortium (http://www.rdml.org) and can be used free of charge. The RDML file format was created to encourage the exchange, publication, revision and re-analysis of raw qPCR data. The core of an RDML fil
...
|
|
|
DistiLD Database: Diseases and Traits in Linkage Disequilibrium Blocks
The DistiLD database aims to increase the usage of existing genome-wide association studies (GWAS) results by making it easy to query and visualize disease-associated SNPs and genes in their chromosomal context.
|
|
|
non-B DB
non-B DNA forming motifs in mammalian genomes
|
|
|
HHMD
Human Histone Modification Database
|
|
|
PolyDoms
An integrated database of human coding single nucleotide polymorphisms (SNPs) and their annotations.
|
|
|
NGSmethDB
Next-generation sequencing single-cytosine-resolution DNA methylation data
|
|
|
Minimum Information About a Genotyping Experiment
MIGen recommends the standard information required to report a genotyping experiment, covering: study and experiment design, subject information, sample collection and processing, genotyping procedure, and data analysis methods, if applicable.
|
|
|
dbCRID
Database of Chromosomal Rearrangements In Diseases
|
|
|
REPAIRtoire
DNA repair pathways of human, yeast and E. coli
|
|
|
Influenza Virus Database
IVDB hosts complete genome sequences of influenza A virus generated by BGI and curates all other published influenza virus sequences after expert annotations. IVDB provides a series of tools and viewers for analyzing the viral genomes, genes, genetic
...
|
|
|
NCBI Epigenomics
The Epigenomics database was retired on June 1, 2016. All epigenomics data are available in our GEO resource https://www.ncbi.nlm.nih.gov/geo. The Epigenomics database provides genomics maps of stable and reprogrammable nucl
...
|
|
|
Human Unidentified Gene-Encoded large proteins database
HUGE is a database for human large proteins newly identified in the Kazusa cDNA project, the aim of which is to predict the primary structure of proteins from the sequences of human large cDNAs (>4 kb).
|
|
|
ROdent Unidentified Gene-Encoded large proteins
The ROUGE protein database is a sister database of HUGE protein database which has accumulated the results of comprehensive sequence analysis of human long cDNAs (KIAA cDNAs). The ROUGE protein database has been created to publicize the information o
...
|
|
|
The Diatom EST Database
Diatoms are photosynthetic unicellular eukaryotes that play an essential role in marine ecosystems. On a global scale, they generate around one fifth of the oxygen we breathe. On this web site we present searchable databases of diatom ESTs (expressed
...
|
|
|
BEI Resource Repository
BEI Resources provides reagents, tools and information for studying Category A, B, and C priority pathogens, emerging infectious disease agents, non-pathogenic microbes and other microbiological materials of relevance to the research community.
|
|
|
YanHuang - YH1 Genome Database
The YH database presents the entire DNA sequence of a Han Chinese individual, as a representative of the Asian population. This genome, named YH, is the start of the YanHuang Project, which aims to sequence 100 Chinese individuals in 3 years. assembled bas
...
|
|
|
WholeCellSimDB
WholeCellSimDB is a database of whole-cell model simulations designed to make it easy for researchers to explore and analyze whole-cell model predictions including predicted: - Metabolite concentrations, - DNA, RNA and protein expression, - DNA-bound
...
|
|
|
Erythropoiesis Database
EpoDB (Erythropoiesis database) is a database of genes that relate to vertebrate red blood cells. It includes DNA sequence, structural features, protein information, gene expression information and transcription factor binding sites.
|
|
|
Oryza Tag Line
Oryza Tag Line consists of a searchable database developed under the Oracle management system, integrating phenotypic data resulting from the evaluation of the Genoplante rice insertion line library.
|
|
|
NPInter v4.0
An integrated database of ncRNA interactions.
Noncoding RNAs (ncRNAs) play crucial regulatory roles in a variety of biological circuits. To document regulatory interactions between ncRNAs and biomolecules, we previously created the NPInter database
...
|
|
|
Chicken Variation Database
The chicken Variation Database (ChickVD) is an integrated information system for storage, retrieval, visualization and analysis of chicken variation data.
|
|
|
SilkDB 3.0
SilkDB is a database of the integrated genome resource for the silkworm, Bombyx mori. This database provides access to not only genomic data including functional annotation of genes, gene products and chromosomal mapping, but also extensive biologica
...
|
|
|
DNA Methylation Database
The database contains information about the occurrence of methylated cytosines in the DNA.
|
|
|
SpliceInfo
The database provides a means of investigating alternative splicing and can be used for identifying alternative splicing-related motifs, such as the exonic splicing enhancer (ESE), the exonic splicing silencer (ESS) and other intronic splicing moti
...
|
|
|
MouseIndelDB
Mouse Indel Polymorphism Database
|
|
|
RSSsite
Reference database and prediction tool for the identification of cryptic recombination signal sequences (RSSs) in the human and mouse genomes.
|
|
|
OryGenesDB: an interactive tool for rice reverse genetics
The aim of this Oryza sativa database was first to display sequence information such as the T-DNA and Ds flanking sequence tags (FSTs) produced in the framework of the French genomics initiative Genoplante and the EU consortium Cereal Gene Tags. This
...
|
|
|
Drosophila polymorphism database
The Drosophila Polymorphism Database is a secondary database designed to provide a collection of all the existing polymorphic sequences in the Drosophila genus. It allows, for the first time, the search for any polymorphic set according to different par
...
|
|
|
Minimal Information for QTLs and Association Studies
The MIQAS set of rules accompanied with the standardized XML and tab-delimited file formats will serve two goals: to encourage research groups that wish to publish a QTL paper to provide and submit the necessary information that would make meta-analy
...
|
|
|
BCCM/GeneCorner Plasmid Collection
The BCCM/GeneCorner Plasmid Collection warrants the long-term storage and distribution of plasmids, microbial host strains and DNA libraries of fundamental, biotechnological, educational or general scientific importance. The focus is on the collectio
...
|
|
|
HumCFS
Manually curated database of human chromosomal fragile sites. HumCFS provides useful information on fragile sites such as coordinates on the chromosome, cytoband, their chemical inducers and frequency of fragile site (rare or common), genes and miRNA
...
|
|
|
3DNALandscapes
Conformational features of DNA
|
|
|
BEDgraph
The bedGraph format allows display of continuous-valued data in track format. This display type is useful for probability scores and transcriptome data. This track type is similar to the wiggle (WIG) format, but unlike the wiggle format, data exporte
...
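As a rough illustration of the track layout described above, the following sketch reads bedGraph text into intervals. It assumes the usual four whitespace-separated columns (chromosome, start, end, value) and that `track` and `browser` lines are header lines to skip; the function name `parse_bedgraph` is hypothetical.

```python
def parse_bedgraph(text):
    """Return a list of (chrom, start, end, value) tuples from bedGraph text."""
    intervals = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if fields[0] in ("track", "browser"):
            continue  # skip track definition and browser lines
        chrom, start, end, value = fields[:4]
        intervals.append((chrom, int(start), int(end), float(value)))
    return intervals
```

Each returned tuple carries one continuous-valued interval, so downstream code can, for example, sum `(end - start) * value` to get a length-weighted total.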
|
|
|
DNAproDB
An expanded database and web-based tool for structural analysis of DNA-protein complexes.
DNAproDB automatically lays out nucleotide and residue interaction maps.
...
|
|
|
Transmembrane Helices in Genome Sequences
A web based database of Transmembrane Helices in Genome Sequences.
|
|
|
IPD-NHKIR - Non-Human Killer-cell Immunoglobulin-like Receptors
The IPD-NHKIR database provides a centralised repository for non-human KIR (NHKIR) sequences. Killer-cell Immunoglobulin-like Receptors (KIR) have been shown to be highly polymorphic at the allelic and haplotypic level. KIRs are members of the immuno
...
|
|
|
PADS Arsenal
A database of genes related to prokaryotic defense systems.
Prokaryotic Antiviral Defense System.
|
|
|
Database Of Local Biomolecular Conformers
Dolbico, the Database Of Local Biomolecular Conformers, stores DNA structural data including the information about DNA local spatial arrangement. The main aim of Dolbico is the exploration of DNA structure at a local level. The analysis of local DNA
...
|
|
|
Single Nucleotide Polymorphism Ontology
The SNP Ontology is a domain ontology that provides a formal representation (OWL-DL) of genomic variations. Despite its name, the SNP Ontology is not limited to the representation of SNPs but encompasses genomic variations in a broader sense. The S
...
|
|
|
DNAmoreDB
DNAzymes, i.e. DNA molecules with catalytic activity
|
|
|
Pig Genomic Informatics System
The Pig Genomic Informatics System (PigGIS) presents accurate pig gene annotations in all sequenced genomic regions. It integrates various available pig sequence data, including 3.84 million whole-genome shotgun (WGS) reads and 0.7 million Expressed
...
|
|
|
Eukaryotic Paralog Group Database
The database is gene-centered and organized by paralog family. It focuses on paralogs and duplication events in evolution. The paralog families and paralogons can be searched by text or sequence, and are downloadable from the website in p
...
|
|
|
Database of local DNA conformers
Dolce is a database of DNA structure motifs based on an automatic classification method consisting of the combination of supervised and unsupervised approaches. This workflow has been applied to analyze 816 X-ray and 664 NMR DNA structures released t
...
|
|
|
Variation data representation and exchange
Data using the VarioML standard can be integrated with the global library of purely genetic data. VarioML is a central prerequisite for effective modelling of phenotype data and genotype-to-phenotype relationships. It removes the obstacles to the eff
...
|
|
|
CUPP
Conserved Unique Peptide Patterns (CUPP) is an approach for sequence analysis employing conserved peptide patterns for determination of similarities between proteins. CUPP performs unsupervised clustering of proteins for formation of protein groups an
...
|
|
|
Animal Genome Size Database
A comprehensive catalogue of animal genome size data where haploid DNA contents (C-values, in picograms) are currently available for 4972 species (3231 vertebrates and 1741 non-vertebrates) based on 6518 records from 669 published sources.
|
|
|
Big Data Nucleic Acid Simulations Database
Atomistic Molecular Dynamics Simulation Trajectories and Analyses of Nucleic Acid Structures. BIGNASim is a complete platform to hold and analyse nucleic acids simulation data, based on two noSQL database engines: Cassandra to hold trajectory data an
...
|
|
|
GSDB
A database of 3D chromosome and genome structures reconstructed from Hi-C data.
Bioinformatics, Data Mining, Machine Learning (BDM) Laboratory.
GSDB: Genome Structure Database.
|
|
|
Access to Biological Collection Data DNA extension
ABCDDNA is a theme specific extension for ABCD (Access to Biological Collections Data) created to facilitate storage and exchange of data related to DNA collection units, such as DNA extraction specifics, DNA quality parameters, and data characterisi
...
|
|
|
Multiple Alignment Format
The Multiple Alignment Format stores DNA level multiple alignments in an easily readable format between entire genomes. Unlike previous formats this resource can cope with forward and reverse strand directions, multiple pieces to the alignment, and s
...
|
|
|
NimbleGen Gene Description
The NimbleGen Gene Description is a text tabular format that contains descriptive information (annotation) about the genes on an array. Please note that Roche issued a statement in 2012 as follows: "As previously announced in June 2012, Roche has exi
...
|
|
|
Gene Transfer Format
The Gene transfer format (GTF) is a file format used to hold information about gene structure. It is a tab-delimited text format based on the general feature format (GFF), but contains some additional conventions specific to gene information. A signi
...
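To make the tab-delimited layout concrete, here is a minimal sketch that splits one GTF line into named fields. It assumes the nine-column layout inherited from GFF and attributes written as `key "value";` pairs; the names `GTF_COLUMNS` and `parse_gtf_line` are hypothetical, and real files may contain attribute forms this does not handle.

```python
import re

# Column names assumed from the GFF-style nine-column layout.
GTF_COLUMNS = ["seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attributes"]
_ATTRIBUTE = re.compile(r'(\S+)\s+"([^"]*)"\s*;')

def parse_gtf_line(line):
    """Parse a single tab-delimited GTF line into a dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    record = dict(zip(GTF_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Turn 'gene_id "G1"; transcript_id "T1";' into a dict.
    record["attributes"] = dict(_ATTRIBUTE.findall(record["attributes"]))
    return record
```

Keeping the attributes as a dictionary makes the gene-specific conventions (such as `gene_id` and `transcript_id`) easy to look up per feature.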
|
|
|
GenBank Sequence Format
GenBank Sequence Format (GenBank Flat File Format) consists of an annotation section and a sequence section. The start of the annotation section is marked by a line beginning with the word "LOCUS". The start of sequence section is marked by a line be
...
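The two-section layout can be sketched as follows. The description above is cut off before naming the sequence marker, so this sketch assumes the `ORIGIN` keyword and the `//` record terminator used in GenBank flat files; the function name `split_genbank_record` is hypothetical and only the first record in the text is read.

```python
def split_genbank_record(text):
    """Split the first GenBank record into (annotation_text, sequence_string)."""
    annotation, sequence = [], []
    started = False
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("LOCUS"):
            started = True  # annotation section begins here
        if not started:
            continue
        if line.startswith("//"):
            break  # assumed end-of-record marker
        if line.startswith("ORIGIN"):
            in_sequence = True  # assumed start of the sequence section
            continue
        if in_sequence:
            # Sequence lines carry a position number plus blocks of bases;
            # keep only the letters.
            sequence.append("".join(ch for ch in line if ch.isalpha()))
        else:
            annotation.append(line)
    return "\n".join(annotation), "".join(sequence)
```

For anything beyond a quick look, an established parser is a safer choice than this sketch, which ignores feature tables and multi-record files.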
|
|
|
Wiggle Track Format
The wiggle (WIG) format is an older format for display of dense, continuous data such as GC percent, probability scores, and transcriptome data. The bigWig format is the recommended format for almost all graphing track needs. For speed and efficiency
...
|
|
|
Stockholm Multiple Alignment Format
The "Stockholm" format is a system for marking up features in a multiple alignment. These mark-up annotations are preceded by a 'magic' label, of which there are four types. The Stockholm format is used by HMMER, Pfam, and Belvu.
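A small sketch of how the four mark-up labels might be separated from the alignment rows. It assumes the labels are `#=GF`, `#=GC`, `#=GS` and `#=GR`, that a file starts with a `# STOCKHOLM` header line and ends with `//`; the names `MARKUP_LABELS` and `parse_stockholm` are hypothetical.

```python
from collections import defaultdict

# The four mark-up labels and what they annotate (assumed meanings).
MARKUP_LABELS = {
    "#=GF": "per-file",
    "#=GC": "per-column",
    "#=GS": "per-sequence",
    "#=GR": "per-residue",
}

def parse_stockholm(text):
    """Return (sequences, markup) from Stockholm-formatted text."""
    sequences = {}
    markup = defaultdict(list)
    for line in text.splitlines():
        line = line.rstrip()
        if not line or line.startswith("# STOCKHOLM") or line == "//":
            continue  # skip blanks, the header line, and the terminator
        label = line[:4]
        if label in MARKUP_LABELS:
            markup[label].append(line[4:].strip())
        else:
            name, aligned = line.split(None, 1)
            # Interleaved files repeat a name; concatenate its blocks.
            sequences[name] = sequences.get(name, "") + aligned.strip()
    return sequences, dict(markup)
```

The mark-up lines are kept as raw strings, grouped by label, so a caller can decide how much of each annotation type to interpret.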
|
|
|
ENA Sequence Flat File Format
ENA Sequence Flat File Format is a standardised plain text format for nucleotide sequences. This format was previously called the EMBL Sequence Flat File Format.
|
|
|
New Hampshire eXtended Format
NHX is based on the New Hampshire (NH) standard (also called "Newick tree format").
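As a hedged illustration of how NHX extends Newick, the sketch below pulls out the bracketed annotations and recovers the plain Newick string. It assumes annotations take the form `[&&NHX:key=value:key=value]`; the function names are hypothetical, and nothing here parses the tree topology itself.

```python
import re

_NHX_BLOCK = re.compile(r"\[&&NHX:([^\]]*)\]")

def nhx_annotations(tree):
    """Return one dict of key/value tags per NHX block, in order of appearance."""
    result = []
    for body in _NHX_BLOCK.findall(tree):
        pairs = (item.split("=", 1) for item in body.split(":") if "=" in item)
        result.append({key: value for key, value in pairs})
    return result

def strip_nhx(tree):
    """Remove NHX blocks, leaving a plain Newick string."""
    return _NHX_BLOCK.sub("", tree)
```

Because the annotations live in bracketed blocks, stripping them yields a string that a plain Newick reader should accept.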
|
|
|
Minimal Information for QTLs and Association Studies Tabular
The MIQAS set of rules accompanied with the standardized XML and tab-delimited file formats will serve two goals: to encourage research groups that wish to publish a QTL paper to provide and submit the necessary information that would make meta-analy
...
|
|
|
ENA Sequence XML Schema
ENA Sequence XML Schema is a standardised XML schema for nucleotide sequences. All assembled and annotated sequences must conform to this schema.
|
|
|
MitoPhen
The MitoPhen database is a human phenotype ontology-based approach to identify mitochondrial DNA diseases.
|
|
|
CNAdbCC
CNAdbCC is a curated database for copy number aberrations analysis and visualization of cervical cancer. Currently, the database contains about 1,000 dataset samples mainly integrated from Affymetrix and Agilent platforms. Affymetrix is based on light-c
...
|
|
|
OncoDB
An interactive online database for analysis of gene expression and viral infection in cancer.
|
|
|
PANDIT
PANDIT is a collection of multiple sequence alignments and phylogenetic trees covering many common protein domains. It contains the seed protein sequence alignments from the Pfam-A (curated families) database; nucleotide sequence alignments derived f
...
|
|
|
Ontology for Genetic Interval
Using BFO (Basic Formal Ontology) as its upper-level ontology, the Ontology for Genetic Interval (OGI) represents a gene as an entity with its 3D shape, topography, and primary DNA sequence as the foundation for its 3D structure. There is no official h
...
|
|
|
The European Bioinformatics Institute Tools
Data resources at the European Bioinformatics Institute (EMBL-EBI, https://www.ebi.ac.uk/) archive, organize and provide added-value analysis of research data produced around the world. This year's update for EMBL-EBI focuses on data exchanges among
...
|
|
|
PhytoPath
Genomics of fungal, oomycete and bacterial phytopathogens
|
|
|
RIKEN Bioresource Research Center
RIKEN BRC collects, preserves and distributes five important bioresources: experimental mouse strains, Arabidopsis thaliana and other laboratory plants, cultured cell lines of human and animal origin, microorganisms, genetic materials of human, anima
...
|
|
|
UniGene
This repository is no longer available. Although the web pages are no longer available, you will still be able to download the final UniGene builds as static content from the FTP site https://ftp.ncbi.nlm.nih.gov/repository/UniGen
...
|
|
|
UCL Infection DNA Bank
Collection of samples and data across the following diseases: Influenza virus (organism). The UCL Infection DNA Bank aims to facilitate research into infectious diseases through the enhanced availability of samples to researchers. This availability cu
...
|
|
|
St Thomas' Hospitals Plasma, serum & DNA Bio bank from patients with antiphospholipid antibodies
Collection of samples and data across the following diseases: Antiphospholipid syndrome (disorder). A frozen biobank collection of plasma, serum, and DNA from APS patients. Each sample is an anonymised blood sample collected from patients who consent for
...
|
|
|
UK MND Collections
Collection of samples and data across the following diseases: Motor neuron disease (disorder). The UK MND Collections (formerly known as the UK MND DNA Bank) was established to provide the international research community with a resource that would he
...
|
|
|
GENOMICS ENGLAND 100K BIOINFORMATICS DATA
Contains tables with data related to genomic data and the outputs from the GEL interpretation pipeline data for participants from both cancer and rare disease programmes. These tables do not directly include primary + secondary sources of clinical da
...
|
|
|
UK primary Sjogren's syndrome Registry
Collection of samples and data across the following diseases: Sjogren's syndrome (disorder). Peripheral blood samples (DNA, RNA, serum, PBMC) from patients with primary Sjogren's syndrome, with detailed contemporaneous clinical data at the time of sam
...
|
|
|
NIHR BioResource: Sample holding
NIHR BioResource samples are held at the NIHR National Biosample Centre in Milton Keynes. Metadata on what is available should become available through the UK CRC Tissue Directory, as mandated by Research Tissue Bank status.
|
|
|
Hospital Episode Statistics Outpatients
Record-level patient data set of patients attending outpatient clinics at NHS hospitals in England. A record represents one appointment.
|
|
|
NIHR IBD BioResource: Sample holding
NIHR IBD BioResource samples are held at the NIHR National Biosample Centre in Milton Keynes. Metadata on what is available should become available through the UK CRC Tissue Directory, as mandated by Research Tissue Bank status.
|
|
|
National Cancer TRE
NHS Digital’s Trusted Research Environment service for England provides approved researchers with access to essential linked, de-identified health data to answer COVID-19 related questions. TRE service provides researchers support on data access requ
...
|
|
|
GENOMICS ENGLAND 100K QUICK VIEW
Quickviews are data views that bring together data from several LabKey tables for convenient access, including:
rare_disease_analysis
Data for all rare disease participants including:
...
|
|
|
GENOMICS ENGLAND 100K RARE DISEASE & COMMON
Rare Disease (RD) data are presented at the level of RD families, RD pedigrees, and participants. Participants are consenting individuals who have had their genome sequenced. Pedigree members are extended members of the proband’s family. Rare Disease
...
|
|
|
GENOMICS ENGLAND 100K NHSD LINKED DATA
NHS national data sets collect information from care records, systems and organisations on specific areas of health and care. HES: Hospital Episode Statistics containing details of all commissioned activity during admissions, outpatient appointments
...
|
|
|
GENOMICS ENGLAND 100K PHE LINKED DATA
This dataset brings together data from more than 500 local and regional datasets to build a picture of an individual’s treatment from diagnosis. Available for patients diagnosed with cancer (ICD10 C00-97, D00-48) from 1 January 1995 to 31 December 2017
...
|
|
|
GENOMICS ENGLAND 100K CANCER & COMMON
Cancer data are presented for either the patient level cancer diagnosis or “disease type” or the tumour specific sample details of participants in the Cancer arm of the 100,000 Genomes Project.
Data Relating to Cancer Participants:
cancer_participa
...
|
|
|
eccDNAdb
A database of extrachromosomal circular DNA profiles in human cancers.
|
|
|
ASMdb
Allele-Specific DNA Methylation Databases (ASMdb) is a comprehensive database for allele-specific DNA methylation in diverse organisms. ASMdb aims to provide a comprehensive resource and a web tool for showing the DNA methylation level and diffe
...
|
|
|
Binary sequence information Format
A .2bit file stores multiple DNA sequences (up to 4 Gb total) in a compact randomly-accessible format. The file contains masking information as well as the DNA itself. The DNA sequence is represented as two bits per base with associated list of regi
...
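A rough illustration of packing DNA at two bits per base, so that four bases fit in one byte. The bit codes used here (T=00, C=01, A=10, G=11, first base in the most significant bits) are assumptions based on the UCSC format description; the file header, sequence index, and masking information are deliberately left out, and `pack_2bit` is a hypothetical name.

```python
# Assumed two-bit codes; this sketch rejects anything other than A, C, G, T.
TWO_BIT_CODES = {"T": 0b00, "C": 0b01, "A": 0b10, "G": 0b11}

def pack_2bit(seq):
    """Pack a DNA string into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        # Pad the last chunk with T (code 00) so every byte is complete.
        chunk = seq[i:i + 4].upper().ljust(4, "T")
        byte = 0
        for base in chunk:
            byte = (byte << 2) | TWO_BIT_CODES[base]
        out.append(byte)
    return bytes(out)
```

Since padding is indistinguishable from real T bases, a caller has to store the true sequence length alongside the packed bytes.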
|
|
|
EpiMOLAS
EpiMOLAS (Epi-genoMics OnLine Analysis System) is an intuitive web-based framework for genome-wide DNA methylation analysis.
|
|
|
CIS-BP
The Catalog of Inferred Sequence Binding Preferences (CIS-BP) is a library of transcription factor (TF) DNA binding motifs and specificities. The data are organized in a user friendly manner for ease of searching, browsing, and downloading. CIS-BP al
...
|
|
|
FinaleDB
FinaleDB (FragmentatIoN AnaLysis of cEll-free DNA DataBase) is a comprehensive cell-free DNA (cfDNA) fragmentation pattern database to host uniformly processed and quality controlled paired-end cfDNA WGS datasets.
|
|
|
Nanobase
A repository for DNA and RNA nanostructures.
|
|
|
Barcode of Life Data Systems
The Barcode of Life Data Systems (BOLD) provides DNA barcode data. BOLD's online workbench supports data validation, annotation, and publication for specimen, distributional, and molecular data. The platform consists of four main modules: a data port
...
|
|
|
NCBI PopSet
NCBI PopSet collects DNA sequences to analyze the ways that populations are related by evolution. Such sequences indicate if populations originate from different members of the same species or from organisms of different species entirely.
|
|
|
modENCODE
The modENCODE Project, Model Organism ENCyclopedia Of DNA Elements, was initiated by the funding of applications received in response to Requests for Applications (RFAs) HG-06-006, entitled Identification of All Functional Elements in Selected Model
...
|
|
|
.ACE format
The ACE file format is a specification for storing data about genomic contigs. The original ACE format was developed for use with Consed, a program for viewing, editing, and finishing DNA sequence assemblies. ACE files are generated by various assemb
...
|
|
|
Comparative genometrics CG database
Devoted to the characterization of complete chromosome sequences by standardized genometric methods, this website displays for sequenced genomes, three different genometric analyses: the DNA walk and the GC and TA skews during the initial phase. Alth
...
|
|
|
National Wild Seed Resource Center
The National Important Wild Plant Germplasm Repository holds ten types of resources and data, such as seeds, DNA, isolated materials, and dried leaves, totaling about 180,000 copies.
|
|
|
NCBI GSS
GSS sequences are now being merged into the NCBI Nucleotide database.
|
|
|
NCBI Trace Archive
The NCBI Trace Archive is a permanent repository of DNA sequence chromatograms (traces), base calls, and quality estimates for single-pass reads from various large-scale sequencing projects. The Trace Archive serves as the repository of sequencing da
...
|
|
|
ADmeth
A manually curated database for the differential methylation in Alzheimer's disease.
|
|
|
SkewDB
A free database of GC and many other skews for over 30,200 chromosomes and plasmids.
|
|
|
BpForms Grammar
The BpForms Grammar extends the IUPAC/IUBMB notation commonly used to represent unmodified DNA, RNA, and proteins to describe non-canonical forms of DNA, RNA, and proteins. Features include the representation of a wider range of monomeric forms, incl
...
|
|
|
ChIP-Seq Transcription Factor Data
We developed a method, ChIP-sequencing (ChIP-seq), combining chromatin immunoprecipitation (ChIP) and massively parallel sequencing to identify mammalian DNA sequences bound by transcription factors in vivo. We used ChIP-seq to map STAT1 targets in i
...
|
|
|
nucleotide inFormation binary Format
The .nib format pre-dates the .2bit format and is less compact. It describes a DNA sequence by packing two bases into each byte.
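The two-bases-per-byte packing can be sketched as below. The nibble codes (T=0, C=1, A=2, G=3, N=4) and the placement of the first base in the high-order four bits are assumptions based on the UCSC format description; the file signature, length field, and mask bit are omitted, and both function names are hypothetical.

```python
# Assumed 4-bit codes for each base.
NIB_CODES = {"T": 0, "C": 1, "A": 2, "G": 3, "N": 4}
NIB_BASES = {code: base for base, code in NIB_CODES.items()}

def pack_nibbles(seq):
    """Pack two bases per byte: first base in the high nibble, second in the low."""
    codes = [NIB_CODES[base] for base in seq.upper()]
    if len(codes) % 2:
        codes.append(0)  # pad the final low nibble; the true length must be kept separately
    return bytes((high << 4) | low for high, low in zip(codes[0::2], codes[1::2]))

def unpack_nibbles(data, length):
    """Recover the first `length` bases from packed bytes, ignoring any mask bit."""
    bases = []
    for byte in data:
        bases.append(NIB_BASES[(byte >> 4) & 0x7])
        bases.append(NIB_BASES[byte & 0x7])
    return "".join(bases[:length])
```

At four bits per base this uses twice the space of a two-bit encoding, which is consistent with the description of .nib as the less compact format.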
|
|
|
Viral Bioinformatics Resource Center
Databases of viral genomic information (genes, gene families, and genomes), and software to perform comparative genomics analyses
|
|
|
Stanford Microarray Database
SMD has been retired.
After approximately fifteen years of microarray-centric research service, the Stanford Microarray Database has been retired. We apologize for any inconvenience; please read below for possible resolutions to your querie
...
|
|
|
NCBI Genome
The Genome database contains annotations and analysis of eukaryotic and prokaryotic genomes, as well as tools that allow users to compare genomes and gene sequences from humans, microbes, plants, viruses and organelles. Users can browse by organism,
...
|
|
|
PubMeth reviewed methylation database in cancer
An annotated and reviewed database of methylation in cancer. It is based on automated text mining of the literature, followed by manual curation and annotation.
|
|
|
UNITE database
UNITE is a database and sequence management environment centered on the eukaryotic nuclear ribosomal ITS region. All eukaryotic ITS sequences from the International Nucleotide Sequence Database Collaboration are clustered to approximately the species
...
|
|
|
Estonian Biocentre Free Data
A small genotype data repository containing data used in recent papers from the Estonian Biocentre. Most of the data pertains to human population genetics. PDF files of the papers are also freely available.
|
|
|
JASPAR RESTful API
Widely used open-access database of curated, non-redundant transcription factor binding profiles. Currently, data from JASPAR can be retrieved as flat files or by using programming language-specific interfaces. Here, we present a programming language
...
|
|
|
Standard Flowgram Format
Standard flowgram format (SFF) is a binary file format used to encode results of pyrosequencing from the 454 Life Sciences platform for high-throughput sequencing. SFF files can be viewed, edited and converted with DNA Baser SFF Workbench (graphic to
...
|
|
|
The Cell Image Library
This library is a public and easily accessible resource database of images, videos, and animations of cells, capturing a wide diversity of organisms, cell types, and cellular processes. The Cell Image Library has been merged with "Cell Centered Datab
...
|
|
|
NAGRP Blast Center
NAGRP Blast Center aggregates various sequence databases and makes them accessible via its website.
|
|
|
Organelle Genome Megasequencing Program
The Organelle Genome Megasequencing Program (OGMP) provides mitochondrial, chloroplast, and mitochondrial plasmid genome data. OGMP tools allow direct comparison of OGMP and NCBI validated records. Includes GOBASE, a taxonomically broad organelle gen
...
|
|
|
Brain Transcriptome Database
The Brain Transcriptome Database (BrainTx) project aims to create an integrated platform to visualize and analyze our original transcriptome data and publicly accessible transcriptome data related to the genetics that underlie the development, functi
...
|
|
|
GeneWeaver.org
GeneWeaver combines cross-species data and gene entity integration, scalable hierarchical analysis of user data with a community-built and curated data archive of gene sets and gene networks, and tools for data driven comparison of user-defined biolo
...
|
|
|
European Xenopus Resource Center
The European Xenopus Resource Centre (EXRC) is situated in Portsmouth, United Kingdom and provides tools and services to support researchers using Xenopus models. The EXRC depends on researchers to obtain and deposit Xenopus transgenic and mutant lin
...
|
|
|
Broad-Novartis Cancer Cell Line Encyclopedia
The Cancer Cell Line Encyclopedia project is a collaboration between the Broad Institute and the Novartis Institutes for Biomedical Research and its Genomics Institute of the Novartis Research Foundation to conduct a detailed genetic and pharmacolog
...
|
|
|
Canadian Epigenetics, Environment and Health Research Consortium Network
CEEHRC represents a multi-stage funding commitment by the Canadian Institutes of Health Research (CIHR) and multiple Canadian and international partners. The overall aim is to position Canada at the forefront of international efforts to translate new
...
|
|
|
IPK Gatersleben
The Institute of Plant Genetics and Crop Plant Research (IPK Gatersleben) is a nonprofit research institution for crop genetics and molecular biology, and is part of the Leibniz Association. The mission of the IPK Gatersleben is to conduct basic and a
...
|
|
|
NCBI Clone DB
>>>!!! NCBI announced plans to retire the Clone DB web interface. Pursuant to this retirement, starting on May 27, 2019, all web pages associated with Clone DB and CloneFinder will redirect to this blog post. Links to Clone DB from the NCBI home page
...
|
|
|
Conservation Genome Resource Bank for Korean Wildlife
Genome resource samples of wild animals, particularly those of endangered mammalian and avian species, are very difficult to collect. In Korea, many of these animals such as tigers, leopards, bears, wolves, foxes, gorals, and river otters, are either
...
|
|
|
International HapMap Project
<<<!!!<<< OFFLINE >>>!!!>>>
A recent computer security audit has revealed security flaws in the legacy HapMap site that require NCBI to take it down immediately. We regret the inconvenience, but we are required to do this. That said, NCBI was plannin
...
|
|
|
Organelle Genome Resource
The organelle genomes are part of the NCBI Reference Sequence (RefSeq) project that provides curated sequence data and related information for the community to use as a standard.
|
|
|
Nematode Expression Pattern DataBase
The Kohara lab has been constructing an expression pattern map of the 100Mb genome of the nematode Caenorhabditis elegans through EST analysis and systematic whole mount in situ hybridization. NEXTDB is the database to integrate all information from
...
|
|
|
ENCODE peak information Format
The ENCODE peak information Format is used to provide called regions of signal enrichment based on pooled, normalized (interpreted) data.
|
|
|
Axt Alignment Format
Axt Alignment files are produced from Blastz, an alignment tool available from Webb Miller's lab at Penn State University. The axtNet and axtChain alignments are produced by processing the alignment files with additional utilities written by Jim Kent
...
|
|
|
Chain Format for pairwise alignment
The chain format describes a pairwise alignment that allows gaps in both sequences simultaneously. Each set of chain alignments starts with a header line, contains one or more alignment data lines, and terminates with a blank line. The format is delib
...
|
|
|
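The header / data lines / blank line layout described for the chain format can be read with a small parser. The following is a minimal sketch, not a reference implementation: the header field names follow the commonly documented UCSC layout, and the sample records are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

# Header field names as commonly documented for UCSC chain files
# (assumed here; they do not appear in the listing above).
HEADER_FIELDS = [
    "score", "tName", "tSize", "tStrand", "tStart", "tEnd",
    "qName", "qSize", "qStrand", "qStart", "qEnd", "id",
]


@dataclass
class Chain:
    header: Dict[str, str]
    # Each block is (size, dt, dq); the final block of a chain is (size,).
    blocks: List[Tuple[int, ...]] = field(default_factory=list)


def parse_chains(lines: Iterable[str]) -> Iterator[Chain]:
    """Yield one Chain per header line; a blank line closes the record."""
    current: Optional[Chain] = None
    for raw in lines:
        parts = raw.split()
        if not parts:  # blank line terminates the current chain
            if current is not None:
                yield current
                current = None
        elif parts[0] == "chain":  # header line starts a new chain
            if current is not None:
                yield current
            current = Chain(header=dict(zip(HEADER_FIELDS, parts[1:])))
        elif current is not None:  # alignment data line
            current.blocks.append(tuple(int(p) for p in parts))
    if current is not None:  # file may end without a trailing blank line
        yield current


# Made-up sample records, for illustration only.
SAMPLE = (
    "chain 1000 chrA 500 + 0 30 chrB 400 + 10 42 1\n"
    "10 0 2\n"
    "20\n"
    "\n"
    "chain 800 chrA 500 + 100 115 chrB 400 - 50 65 2\n"
    "15\n"
)

chains = list(parse_chains(SAMPLE.splitlines()))
for chain in chains:
    print(chain.header["id"], chain.header["tName"], chain.blocks)
```

The parser keeps header values as strings and only converts the block sizes and gaps to integers, so callers can decide how strictly to validate coordinates.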
Genome Annotation File version 1
Annotation data is submitted to the GO Consortium in the form of gene association files, or GAFs. This standard lays out the format specification for GAF 1.0.
|
|
|
net alignment annotation Format
The net file format is used to describe the axtNet data that underlie the net alignment annotations in the Genome Browser.
|
|
|
Gene Prediction File Format
Gene Prediction File Format (genePred) is a table format commonly used for gene prediction tracks in the Genome Browser. Variations of genePred include the standard format, the extended format, and a format that includes RefSeq genes with gene names.
|
|
|
Personal Genome SNP Format
This format is for displaying SNPs from personal genomes. It is the same format used for the Genome Variants and Population Variants tracks.
|
|
|
Genome Annotation File version 2
Annotation is the process of assigning GO terms to gene products. The annotation data in the GO database is contributed by members of the GO Consortium, and the Consortium actively encourages new groups to start contributing annotation. Annotation da
...
|
|
*ReputationScore indicates how established a given datasource is.