Source | Match | ReputationScore* |
---|---|---|
GenBank
GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. The complete release notes for the current version of GenBank are available on the NCBI ftp site. A new release is made every two months. G
...
|
|
|
PRoteomics IDEntifications database
The PRIDE PRoteomics IDEntifications database is a centralized, standards-compliant, public data repository that provides protein and peptide identifications together with supporting evidence.
|
|
|
Integrated resource of protein families, domains and functional sites
InterPro is a resource that provides functional analysis of protein sequences by classifying them into families and predicting the presence of domains and important sites. To classify proteins in this way, InterPro uses predictive models, known as si
...
|
|
|
Sequence Read Archive
The Sequence Read Archive (SRA) stores raw sequencing data from the next generation of sequencing platforms. Data submitted to SRA is organized using a metadata model consisting of six objects: study, sample, experiment, run, analysis and submissi
...
|
|
|
Reference Sequence Database
The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins.
|
|
|
European Variation Archive
The European Variation Archive is an open-access archive that accepts submission of, and provides access to, all types of genetic variation data from all species. All users are able to download any dataset, or query our study catalogue via our variat
...
|
|
|
European Nucleotide Archive
The European Nucleotide Archive (ENA) is a globally comprehensive data resource for nucleotide sequence, spanning raw data, alignments and assemblies, functional and taxonomic annotation and rich contextual data relating to sequenced samples and expe
...
|
|
|
Insertion Sequence Finder
This database provides a list of insertion sequences (IS) isolated from bacteria and archaea. It is organized into individual files containing their general features (name, size, origin, family, ...) as well as their DNA and potential protein sequence
...
|
|
|
PCR Primer Database for Gene Expression Detection and Quantification
PrimerBank is a public resource for PCR primers. These primers are designed for gene expression detection or quantification (real-time PCR). PrimerBank contains over 306,800 primers covering most known human and mouse genes. There are several ways to
...
|
|
|
PomBase
PomBase is a model organism database that provides organization of and access to scientific data for the fission yeast Schizosaccharomyces pombe. PomBase supports genomic sequence and features, genome-wide datasets and manual literature curation as w
...
|
|
|
BioSamples at the European Bioinformatics Institute
The BioSamples database aggregates sample information for reference samples (e.g. Coriell Cell lines) and samples for which data exist in one of the EBI's assay databases such as ArrayExpress, the European Nucleotide Archive or PRIDE. It provides lin
...
|
|
|
Stanford HIV Drug Resistance Database
The Stanford HIV Drug Resistance Database (HIVDB) is an essential resource for public health officials monitoring ADR and TDR, for scientists developing new ARV drugs, and for HIV care providers managing patients with HIVDR.
|
|
|
Sol Genomics Network
The Sol Genomics Network (SGN) is a database and website dedicated to the genomic information of the Solanaceae family, which includes species such as tomato, potato, pepper, petunia and eggplant.
|
|
|
CottonGen
CottonGen is a cotton community genomics, genetics and breeding database being developed to enable basic, translational and applied research in cotton. It is being built using the open-source Tripal database infrastructure. CottonGen supersedes Cotto
...
|
|
|
Japan Proteome Standard Repository
jPOSTrepo (Japan ProteOme STandard Repository) is a data repository for sharing MS raw/processed data.
|
|
|
Mammalian Gene Collection
The NIH Mammalian Gene Collection (MGC) program is a multi-institutional effort to identify and sequence cDNA clones containing a full-length open reading frame (FL-ORF) for human, mouse, and rat genes. To date, the MGC has produced over 324
...
|
|
|
DNA Data Bank of Japan
An annotated collection of all publicly available nucleotide and protein sequences. DDBJ collects sequence data mainly from Japanese researchers, as well as researchers in other countries. DDBJ is part of the International Nucleotide Sequence Databas
...
|
|
|
Sequencing Initiative Suomi
The Sequencing Initiative Suomi (SISu) search engine offers a way to search for data on sequence variants in the Finnish population. It provides valuable summary data for researchers and clinicians as well as other researchers with an interest in gen
...
|
|
|
Giga Science Database
GigaDB primarily serves as a repository to host data and tools associated with articles in GigaScience; however, it also includes a subset of datasets that are not associated with GigaScience articles. GigaDB defines a dataset as a group of files (e.
...
|
|
|
BIG Data Center
The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of multi-omics data
...
|
|
|
NCBI BioProject
A BioProject is a collection of biological data related to a single initiative, originating from a single organization or from a consortium. A BioProject record provides users a single place to find links to the diverse data types generated for that
...
|
|
|
BioProject XML Schema
This is an XML Schema specification of BioProject data. A BioProject is a collection of biological data related to a single initiative, originating from a single organization or from a consortium. A BioProject record provides users a single place to f
...
|
|
|
The Chromosome 7 Annotation Project
The objective of this project is to generate the most comprehensive description of human chromosome 7 to facilitate biological discovery, disease gene research and medical genetic applications.
|
|
|
CODEX
ChIP-Seq, RNA-Seq and DNase-Seq data for haematopoietic and embryonic stem cells
|
|
|
CMR
The Comprehensive Microbial Resource (CMR) gives access to a central repository of the sequence and annotation of all complete public prokaryotic genomes as well as comparative genomics tools across all of the genomes in the database.
|
|
|
ViruSurf
ViruSurf is a large public database of viral sequences and integrated and curated metadata from heterogeneous sources (RefSeq, GenBank, COG-UK and NMDC); it also exposes computed nucleotide and amino acid variants, called from original sequences. A G
...
|
|
|
Minimal information about Adaptive Immune Receptor Repertoire
Minimal information about Adaptive Immune Receptor Repertoire (MiAIRR) is a checklist of minimally required information that we recommend journals adopt, and that could form the requirements for submission to a public data repository. AIRR sequencing
...
|
|
|
ChimerDB
ChimerDB is a database of fusion sequences encompassing bioinformatics analysis of mRNA and EST sequences in GenBank, manual collection of literature data and integration with other well-known databases. Fusion transcripts with nonoverlapping ali
...
|
|
|
piRBase
piRBase stores information on piRNAs and piRNA-associated data to support piRNA functional analysis.
|
|
|
SCPortalen
SCPortalen is a single-cell database created to facilitate and enable researchers to access and explore published single-cell datasets. It integrates human and mouse single-cell transcriptomics datasets, single-cell metadata, cell images and sequence
...
|
|
|
EnhancerAtlas 2.0
Enhancers are a class of cis-regulatory elements that can increase gene transcription by forming loops in intergenic regions, introns and exons. Enhancers, as well as their associated target genes, and transcription factors (TFs) that bind to them, a
...
|
|
|
Immune Tolerance Network TrialShare
The immune tolerance data management and visualization portal for studies sponsored by the Immune Tolerance Network (ITN) and collaborating investigators. Data from published studies are accessible to any user; data from current in-progress studies a
...
|
|
|
GlycoPOST
GlycoPOST is a mass spectrometry data repository for glycomics. Users can release their "raw/processed" data via this site with a unique identifier number for the paper publication. Submission conditions are in accordance with the Minimum Information
...
|
|
|
Resource of Asian Primary Immunodeficiency Diseases
The Resource of Asian Primary Immunodeficiency Diseases (RAPID) is a repository of molecular alterations in primary immunodeficiency diseases (PID). It hosts information on sequence variations and expression at the mRNA and protein levels of all gene
...
|
|
|
NGSmethDB
Next-generation sequencing single-cytosine-resolution DNA methylation data
|
|
|
DeepHF
Optimized CRISPR guide RNA design for two high-fidelity Cas9 variants by deep learning | Core code for the DeepHF prediction tool | SpCas9 & Base Editor Efficiency Prediction | This tool provides guide designs for Wild-type SpCas9, two highly specifi
...
|
|
|
Nematodes.org
Wiki for coordinating nematode sequencing projects
|
|
|
RSSsite
Reference database and prediction tool for the identification of cryptic recombination signal sequences (RSSs) in the human and mouse genomes.
|
|
|
JMorp
Japanese Multi Omics Reference Panel
|
|
|
MaveDB
An open-source platform to distribute and interpret data from multiplexed assays of variant effect.
Table of Multiplexed Assay of Variant Effect (MAVE) studies.
MaveDB - A repository for MAVE assay datasets.
To cite this document, please use the c
...
|
|
|
TIARA - Total Integrated Archive of short-Read and Array
The Total Integrated Archive of short-Read and Array (TIARA) accumulates raw-level personal genomic data from whole genome next-generation sequencing (NGS) and comparative genomic hybridization (CGH) arrays. Initially, it contains 36 individual genom
...
|
|
|
AlgaePath
Comprehensive analysis of metabolic pathways using transcript abundance data from next-generation sequencing in green algae.
|
|
|
Yersinia
Genus-wide Yersinia core-genome multilocus sequence typing for species identification and strain characterization.
|
|
|
VariCarta
A Comprehensive Database of Harmonized Genomic Variants Found in Autism Spectrum Disorder Sequencing Studies.
VariCarta is a curated, web-based database housing ASD-linked genes created from the meta-analysis of -omic sequencing literature.
VariCar
...
|
|
|
FORK-seq
FORK-seq is a replication landscape of the Saccharomyces cerevisiae genome by nanopore sequencing
|
|
|
Clinical NGS DB
Tool for the Unified Management of Clinical Information and Genetic Variants to Accelerate Variant Pathogenicity Classification.
|
|
|
BEable-GPS
BEable-GPS: Base Editable prediction of Global Pathogenic-related SNVs. Comparison of cytosine base editors and development of the BEable-GPS database for targeting pathogenic SNVs.
|
|
|
Diat.barcode
An open-access curated barcode library for diatoms.
Diatoms (Bacillariophyta) are ubiquitous microalgae which produce a siliceous exoskeleton and which make a major contribution to the productivity of oceans and freshwaters. They display a huge dive
...
|
|
|
CZEUM
The Collection of Zoosporic Eufungi at the University of Michigan (CZEUM) is a database of barcoded Chytridiomyceta and Blastocladiomycota cultures.
|
|
|
nanobodies
INDI-integrated nanobody database for immunoinformatics.
|
|
|
dbAMP
dbAMP 2.0: updated resource for antimicrobial peptides with an enhanced scanning method on genomic and proteomic data.
|
|
|
AcetoBase
AcetoBase is a dedicated repository and curated database for the analysis of acetogenic bacteria based on the key functional gene formyltetrahydrofolate synthetase (FTHFS/fhs) of the Wood-Ljungdahl pathway for acetogenesis.
|
|
|
Animal Genome Size Database
A comprehensive catalogue of animal genome size data where haploid DNA contents (C-values, in picograms) are currently available for 4972 species (3231 vertebrates and 1741 non-vertebrates) based on 6518 records from 669 published sources.
|
|
|
GEAR-base
GEnetic Antibiotic Resistance and Susceptibility Database.
|
|
|
dbMMR-Chinese
Variants of DNA mismatch repair genes derived from 33,998 Chinese individuals with and without cancer reveal their highly ethnic-specific nature.
An open-access database of DNA mismatch repair (MMR) gene variants in Chinese population.
DNA mismatch
...
|
|
|
HDAM
A resource of human disease associated mutations from next generation sequencing studies.
|
|
|
SEQdata-BEACON
SEQdata-BEACON is a comprehensive database of sequencing performance and statistical tools for performance evaluation and yield simulation in BGISEQ-500.
|
|
|
ImtRDB
Database and software for mitochondrial imperfect interspersed repeats annotation.
|
|
|
ASRD
An online database for exploring over 2,000 Arabidopsis small RNA libraries.
|
|
|
pr2-primers
A database of eukaryotic rRNA primers and primer sets for metabarcoding studies compiled from the literature.
|
|
|
NoBadWordsCombiner
Protocol for using NoBadWordsCombiner to merge and minimize "bad words" from BLAST hits against multiple eukaryotic gene annotation databases.
|
|
|
anti-CRISPRdb
Anti-CRISPRdb is a comprehensive online resource that effectively organizes anti-CRISPR proteins determined by experimental and bioinformatics methods. It also provides nucleotide sequences, interactors, three-dimensional structures,
...
|
|
|
NIHR BioResource: Whole Genome Sequencing
The NIHR BioResource ran the pilot for GEL's 100,000 Genomes Project. Most of the participants with rare disease were recruited on the basis of having no known diagnosis, and have had extensive work up on WGS data, including reporting to the clinical
...
|
|
|
Wellcome Sanger Institute: Whole Exome Sequencing
There is a substantial overlap between the NIHR IBD BioResource and the IBD UK Genetics Consortium (IBDGC). The NIHR BioResource provides some DNA samples. IBDGC data is being provided by the Wellcome Sanger Institute, who are performing the sequenci
...
|
|
|
GENOMICS ENGLAND 100K BIOINFORMATICS DATA
Contains tables with data related to genomic data and the outputs from the GEL interpretation pipeline data for participants from both cancer and rare disease programmes. These tables do not directly include primary + secondary sources of clinical da
...
|
|
|
GENOMICS ENGLAND 100K CANCER & COMMON
Cancer data are presented for either the patient level cancer diagnosis or “disease type” or the tumour specific sample details of participants in the Cancer arm of the 100,000 Genomes Project.
Data Relating to Cancer Participants:
cancer_participa
...
|
|
|
OpenContami
A web-based application for detecting microbial contaminants in next-generation sequencing data. OpenContami: Open Cell Microbial Contaminants by High-throughput Sequencing.
|
|
|
DNMSO
DNMSO is an ontology for representing de novo sequencing results from Tandem-MS data. For the identification and sequencing of proteins, mass spectrometry (MS) has become the tool of choice and as such drives proteomics.
|
|
|
Bovine Genome Variation Database (BGVD)
An integrated Web-database for bovine sequencing variations and selective signatures.
|
|
|
ChIP-Seq Transcription Factor Data
We developed a method, ChIP-sequencing (ChIP-seq), combining chromatin immunoprecipitation (ChIP) and massively parallel sequencing to identify mammalian DNA sequences bound by transcription factors in vivo. We used ChIP-seq to map STAT1 targets in i
...
|
|
|
SEAR: Search Engine for Antimicrobial Resistance
Construct full-length, horizontally acquired Antibiotic Resistance Genes (ARGs) from sequencing datasets. It has been designed with environmental metagenomics and microbiome experiments in mind, where the diversity and relative abundance of ARGs need
...
|
|
|
PhytoTypeDB
Database of plant protein inter-cultivar variability and function.
|
|
|
HeveaDB
A genetic resource database for rubber tree genomic study | Molecular & Genetic Resources for Hevea tree
|
|
|
sRNAanno
A database repository of uniformly annotated small RNAs in plants | Small RNAs (sRNAs) are essential regulatory molecules, including three major classes in plants, microRNAs (miRNAs), phased small interfering RNAs (phased siRNAs or phasiRNAs
...
|
|
|
Nanobase
A repository for DNA and RNA nanostructures.
|
|
|
ORSO
A data-driven social network connecting scientists to genomics datasets.
ORSO (Online Resource for Social Omics) is a web application designed to help users find next generation sequencing (NGS) datasets relevant to their research interests. ORSO per
...
|
|
|
Gene4HL
An Integrated Genetic Database for Hearing Loss.
|
|
|
RGEN
Computational tools and libraries for CRISPR/Cas9-derived RNA-guided engineered nucleases (RGENs).
|
|
|
GESS v2
Advanced Functions Embedded in the Second Version of Database, Global Evaluation of SARS-CoV-2/hCoV-19 Sequences 2.
|
|
|
CohesinDB
A comprehensive database for decoding cohesin-related epigenomes, 3D genomes and transcriptomes in human cells.
|
|
|
Gowinda
Gowinda: unbiased analysis of gene set enrichment for Genome Wide Association Studies
|
|
|
MitoLink
A generic integrated web-based workflow system to evaluate genotype-phenotype correlations in human mitochondrial diseases.
|
|
|
REVA
REVA as a Well-curated Database for Human Expression-modulating Variants.
|
|
|
CoxBase
CoxBase is an online platform for epidemiological surveillance, visualization, analysis and typing of Coxiella burnetii genomic sequence.
|
|
|
EpiMOLAS
EpiMOLAS (Epi-genoMics OnLine Analysis System) is an intuitive web-based framework for genome-wide DNA methylation analysis.
|
|
|
NexGenEx-Tom
Gene expression platform for investigating gene expression and function in the tomato genome. It includes expression data from the cultivated varieties Heinz 1706 and Ailsa Craig and from Solanum pimpinellifolium.
|
|
|
EUAdb
EUAdb is a database for COVID-19 test development that contains standardized information about Emergency Use Authorization-issued tests and focuses on RT-qPCR diagnostic tests or high-complexity molecular-based laboratory-developed tests.
|
|
|
BDdb
BDdb is a comprehensive database associated with birth-defect-related diseases. It consists of multi-omics datasets involving tens of common birth-defect diseases, and BDdb supplements more than 2000 biomarkers belonging to 22 types of birth defects.
|
|
|
DDBJ BioProject
The DDBJ BioProject resource organizes both the projects and the data from those projects which is deposited into several archival databases maintained by members of the INSDC. This allows searching by characteristics of these projects, using the pro
...
|
|
|
TEx-MST
TEx-MST is a bioinformatic database that provides expression information for MANE-select transcripts in normal human tissues.
|
|
|
PlantRep
Plant Repeat Database (PlantRep) provides re-annotated repeat sequences of plants using a uniform pipeline. The current version of PlantRep contains 206.04 Gb of 396,041,410 repeats from 459 species that were divided into 15 clades based on their phylog
...
|
|
|
CSI NGS Portal
An Online Platform for Automated NGS Data Analysis and Sharing.
CSI NGS Portal is an online platform for fully automated NGS data analysis and sharing.
CSI NGS Portal uses a single, randomly generated, persistent, secure and http-only browser cookie
...
|
|
|
National Wild Seed Resource Center
The National Important Wild Plant Germplasm Repository has ten types of resources and data such as seeds, DNA, isolated materials, dried leaves, etc. totaling about 180,000 copies
|
|
|
CANT-HYD
Calgary approach to ANnoTating HYDrocarbon degradation genes (CANT-HYD), a database of 37 HMMs of marker genes involved in anaerobic and aerobic degradation pathways of aliphatic and aromatic hydrocarbons.
|
|
|
REPIC
A database for exploring N6-methyladenosine methylome.
REPIC (RNA Epitranscriptome Collection) is a database dedicated to providing a new resource to investigate potential functions and mechanisms of N6-adenosine methylation (m6A) modifications. Curre
...
|
|
|
Physical mapping data at Canada's Michael Smith Genome Sciences Centre - Data
FPC Mapping data files from species that have been fingerprinted at Canada's Michael Smith Genome Sciences Centre (BCGSC).
|
|
|
Combined QTL Map of Dairy Cattle Traits
2021-09-01: repository is offline. Background: Many studies have been conducted to detect quantitative trait loci (QTL) in dairy cattle. However, these studies are diverse in terms of their differing resource populations, marker
...
|
|
*ReputationScore indicates how established a given datasource is.