Tag: sequencing

Found 97 sources

Source	Match	ReputationScore*
GenBank GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. The complete release notes for the current version of GenBank are available on the NCBI ftp site. A new release is made every two months. G ...		100%
PRoteomics IDEntifications database The PRIDE PRoteomics IDEntifications database is a centralized, standards compliant, public data repository that provides protein and peptide identifications together with supporting evidence.		99%
Integrated resource of protein families, domains and functional sites InterPro is a resource that provides functional analysis of protein sequences by classifying them into families and predicting the presence of domains and important sites. To classify proteins in this way, InterPro uses predictive models, known as si ...		92%
Sequence Read Archive The Sequence Read Archive (SRA) stores raw sequencing data from the next generation of sequencing platforms. Data submitted to SRA is organized using a metadata model consisting of six objects: study, sample, experiment, run, analysis and submissi ...		92%
Reference Sequence Database The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins.		83%
European Variation Archive The European Variation Archive is an open-access archive that accepts submission of, and provides access to, all types of genetic variation data from all species. All users are able to download any dataset, or query our study catalogue via our variat ...		76%
European Nucleotide Archive The European Nucleotide Archive (ENA) is a globally comprehensive data resource for nucleotide sequence, spanning raw data, alignments and assemblies, functional and taxonomic annotation and rich contextual data relating to sequenced samples and expe ...		72%
Insertion Sequence Finder This database provides a list of insertion sequences (IS) isolated from bacteria and archaea. It is organized into individual files containing their general features (name, size, origin, family, ...) as well as their DNA and potential protein sequence ...		64%
PCR Primer Database for Gene Expression Detection and Quantification PrimerBank is a public resource for PCR primers. These primers are designed for gene expression detection or quantification (real-time PCR). PrimerBank contains over 306,800 primers covering most known human and mouse genes. There are several ways to ...		62%
PomBase PomBase is a model organism database that provides organization of and access to scientific data for the fission yeast Schizosaccharomyces pombe. PomBase supports genomic sequence and features, genome-wide datasets and manual literature curation as w ...		61%
BioSamples at the European Bioinformatics Institute The BioSamples database aggregates sample information for reference samples (e.g. Coriell Cell lines) and samples for which data exist in one of the EBI's assay databases such as ArrayExpress, the European Nucleotide Archive or PRIDE. It provides lin ...		60%
Stanford HIV Drug Resistance Database The Stanford HIV Drug Resistance Database (HIVDB) is an essential resource for public health officials monitoring ADR and TDR, for scientists developing new ARV drugs, and for HIV care providers managing patients with HIVDR.		60%
Sol Genomics Network The Sol Genomics Network (SGN) is a database and website dedicated to the genomic information of the Solanaceae family, which includes species such as tomato, potato, pepper, petunia and eggplant.		58%
CottonGen CottonGen is a cotton community genomics, genetics and breeding database being developed to enable basic, translational and applied research in cotton. It is being built using the open-source Tripal database infrastructure. CottonGen supersedes Cotto ...		58%
Japan Proteome Standard Repository jPOSTrepo (Japan ProteOme STandard Repository) is a data repository for sharing MS raw/processed data.		56%
Mammalian Gene Collection Overview The NIH Mammalian Gene Collection (MGC) program is a multi-institutional effort to identify and sequence cDNA clones containing a full-length open reading frame (FL-ORF) for human, mouse, and rat genes. To date, the MGC has produced over 324 ...		55%
DNA Data Bank of Japan An annotated collection of all publicly available nucleotide and protein sequences. DDBJ collects sequence data mainly from Japanese researchers, as well as researchers in other countries. DDBJ is part of the International Nucleotide Sequence Databas ...		55%
Sequencing Initiative Suomi The Sequencing Initiative Suomi (SISu) search engine offers a way to search for data on sequence variants in the Finnish population. It provides valuable summary data for researchers and clinicians as well as other researchers with an interest in gen ...		55%
Giga Science Database GigaDB primarily serves as a repository to host data and tools associated with articles in GigaScience; however, it also includes a subset of datasets that are not associated with GigaScience articles. GigaDB defines a dataset as a group of files (e. ...		54%
BIG Data Center The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of multi-omics data ...		54%
NCBI BioProject A BioProject is a collection of biological data related to a single initiative, originating from a single organization or from a consortium. A BioProject record provides users a single place to find links to the diverse data types generated for that ...		48%
BioProject XML Schema This is an XML Schema specification of BioProject data. A BioProject is a collection of biological data related to a single initiative, originating from a single organization or from a consortium. A BioProject record provides users a single place to f ...		48%
The Chromosome 7 Annotation Project The objective of this project is to generate the most comprehensive description of human chromosome 7 to facilitate biological discovery, disease gene research and medical genetic applications.		47%
CODEX ChIP-Seq, RNA-Seq and DNase-Seq data for haematopoietic and embryonic stem cells		45%
CMR The Comprehensive Microbial Resource (CMR) gives access to a central repository of the sequence and annotation of all complete public prokaryotic genomes as well as comparative genomics tools across all of the genomes in the database.		45%
ViruSurf ViruSurf is a large public database of viral sequences and integrated and curated metadata from heterogeneous sources (RefSeq, GenBank, COG-UK and NMDC); it also exposes computed nucleotide and amino acid variants, called from original sequences. A G ...		45%
Minimal information about Adaptive Immune Receptor Repertoire Minimal information about Adaptive Immune Receptor Repertoire (MiAIRR) is a checklist of minimally required information that we recommend journals adopt, and that could form the requirements for submission to a public data repository. AIRR sequencing ...		44%
ChimerDB ChimerDB is a database of fusion sequences encompassing bioinformatics analysis of mRNA and EST sequences in the GenBank, manual collection of literature data and integration with other well known databases. Fusion transcripts with nonoverlapping ali ...		44%
piRBase piRBase stores information on piRNAs and piRNA-associated data to support piRNA functional analysis.		43%
SCPortalen SCPortalen is a single-cell database created to facilitate and enable researchers to access and explore published single-cell datasets. It integrates human and mouse single-cell transcriptomics datasets, single-cell metadata, cell images and sequence ...		42%
EnhancerAtlas 2.0 Enhancers are a class of cis-regulatory elements that can increase gene transcription by forming loops in intergenic regions, introns and exons. Enhancers, as well as their associated target genes, and transcription factors (TFs) that bind to them, a ...		42%
Immune Tolerance Network TrialShare The immune tolerance data management and visualization portal for studies sponsored by the Immune Tolerance Network (ITN) and collaborating investigators. Data from published studies are accessible to any user; data from current in-progress studies a ...		42%
GlycoPOST GlycoPOST is a mass spectrometry data repository for glycomics. Users can release their "raw/processed" data via this site with a unique identifier number for the paper publication. Submission conditions are in accordance with the Minimum Information ...		41%
Resource of Asian Primary Immunodeficiency Diseases The Resource of Asian Primary Immunodeficiency Diseases (RAPID) is a repository of molecular alterations in primary immunodeficiency diseases (PID). It hosts information on sequence variations and expression at the mRNA and protein levels of all gene ...		41%
NGSmethDB Next-generation sequencing single-cytosine-resolution DNA methylation data		40%
DeepHF Optimized CRISPR guide RNA design for two high-fidelity Cas9 variants by deep learning \| Core code for the DeepHF prediction tool \| SpCas9 & Base Editor Efficiency Prediction \| This tool provides guide designs for Wild-type SpCas9, two highly specifi ...		40%
Nematodes.org Wiki for coordinating nematode sequencing projects		39%
RSSsite Reference database and prediction tool for the identification of cryptic recombination signal sequences (RSSs) in the human and mouse genomes.		38%
JMorp Japanese Multi Omics Reference Panel		38%
MaveDB An open-source platform to distribute and interpret data from multiplexed assays of variant effect. Table of Multiplexed Assay of Variant Effect (MAVE) studies. MaveDB - A repository for MAVE assay datasets. To cite this document, please use the c ...		38%
TIARA - Total Integrated Archive of short-Read and Array The Total Integrated Archive of short-Read and Array (TIARA) accumulates raw-level personal genomic data from whole genome next-generation sequencing (NGS) and comparative genomic hybridization (CGH) arrays. Initially, it contains 36 individual genom ...		38%
AlgaePath Comprehensive analysis of metabolic pathways using transcript abundance data from next-generation sequencing in green algae.		36%
Yersinia Genus-wide Yersinia core-genome multilocus sequence typing for species identification and strain characterization.		36%
VariCarta A Comprehensive Database of Harmonized Genomic Variants Found in Autism Spectrum Disorder Sequencing Studies. VariCarta is a curated, web-based database housing ASD-linked genes created from the meta-analysis of -omic sequencing literature. VariCar ...		35%
FORK-seq FORK-seq is a replication landscape of the Saccharomyces cerevisiae genome by nanopore sequencing		35%
Clinical NGS DB Tool for the Unified Management of Clinical Information and Genetic Variants to Accelerate Variant Pathogenicity Classification.		35%
BEable-GPS BEable-GPS: Base Editable prediction of Global Pathogenic-related SNVs. Comparison of cytosine base editors and development of the BEable-GPS database for targeting pathogenic SNVs.		35%
Diat.barcode An open-access curated barcode library for diatoms. Diatoms (Bacillariophyta) are ubiquitous microalgae which produce a siliceous exoskeleton and which make a major contribution to the productivity of oceans and freshwaters. They display a huge dive ...		35%
CZEUM The Collection of Zoosporic Eufungi at the University of Michigan (CZEUM) is a database of barcoded Chytridiomyceta and Blastocladiomycota cultures.		35%
nanobodies INDI-integrated nanobody database for immunoinformatics.		34%
dbAMP dbAMP 2.0: updated resource for antimicrobial peptides with an enhanced scanning method on genomic and proteomic data.		34%
AcetoBase AcetoBase is a dedicated repository and curated database for the analysis of acetogenic bacteria based on the key functional gene formyltetrahydrofolate synthetase (FTHFS/fhs) of Wood-Ljungdahl Pathway for Acetogenesis.		34%
Animal Genome Size Database A comprehensive catalogue of animal genome size data where haploid DNA contents (C-values, in picograms) are currently available for 4972 species (3231 vertebrates and 1741 non-vertebrates) based on 6518 records from 669 published sources.		34%
GEAR-base GEnetic Antibiotic Resistance and Susceptibility Database.		34%
dbMMR-Chinese Variants of DNA mismatch repair genes derived from 33,998 Chinese individuals with and without cancer reveal their highly ethnic-specific nature. An open-access database of DNA mismatch repair (MMR) gene variants in Chinese population. DNA mismatch ...		34%
HDAM A resource of human disease associated mutations from next generation sequencing studies.		34%
SEQdata-BEACON SEQdata-BEACON is a comprehensive database of sequencing performance and statistical tools for performance evaluation and yield simulation in BGISEQ-500.		34%
ImtRDB Database and software for mitochondrial imperfect interspersed repeats annotation.		34%
ASRD An online database for exploring over 2,000 Arabidopsis small RNA libraries.		33%
pr2-primers A database of eukaryotic rRNA primers and primer sets for metabarcoding studies compiled from the literature.		33%
NoBadWordsCombiner Protocol for using NoBadWordsCombiner to merge and minimize "bad words" from BLAST hits against multiple eukaryotic gene annotation databases.		33%
anti-CRISPRdb Anti-CRISPRdb is a comprehensive online resource that effectively organizes anti-CRISPR proteins determined by experimental and bioinformatics methods. Additionally, it also provides nucleotide sequences, interactors, three-dimensional structures, ...		32%
NIHR BioResource: Whole Genome Sequencing The NIHR BioResource ran the pilot for GEL's 100,000 Genomes Project. Most of the participants with rare disease were recruited on the basis of having no known diagnosis, and have had extensive work up on WGS data, including reporting to the clinical ...		31%
Wellcome Sanger Institute: Whole Exome Sequencing There is a substantial overlap between the NIHR IBD BioResource and the IBD UK Genetics Consortium (IBDGC). The NIHR BioResource provides some DNA samples. IBDGC data is being provided by the Wellcome Sanger Institute, who are performing the sequenci ...		31%
GENOMICS ENGLAND 100K BIOINFORMATICS DATA Contains tables with data related to genomic data and the outputs from the GEL interpretation pipeline data for participants from both cancer and rare disease programmes. These tables do not directly include primary + secondary sources of clinical da ...		31%
GENOMICS ENGLAND 100K CANCER & COMMON Cancer data are presented for either the patient level cancer diagnosis or “disease type” or the tumour specific sample details of participants in the Cancer arm of the 100,000 Genomes Project. Data Relating to Cancer Participants: cancer_participa ...		31%
OpenContami A web-based application for detecting microbial contaminants in next-generation sequencing data. OpenContami: Open Cell Microbial Contaminants by High-throughput Sequencing.		30%
DNMSO DNMSO is an ontology for representing de novo sequencing results from Tandem-MS data. For the identification and sequencing of proteins, mass spectrometry (MS) has become the tool of choice and as such drives proteomics.		30%
Bovine Genome Variation Database (BGVD) An integrated Web-database for bovine sequencing variations and selective signatures.		30%
ChIP-Seq Transcription Factor Data We developed a method, ChIP-sequencing (ChIP-seq), combining chromatin immunoprecipitation (ChIP) and massively parallel sequencing to identify mammalian DNA sequences bound by transcription factors in vivo. We used ChIP-seq to map STAT1 targets in i ...		30%
SEAR: Search Engine for Antimicrobial Resistance Construct full-length, horizontally acquired Antibiotic Resistance Genes (ARGs) from sequencing datasets. It has been designed with environmental metagenomics and microbiome experiments in mind, where the diversity and relative abundance of ARGs need ...		30%
PhytoTypeDB Database of plant protein inter-cultivar variability and function.		30%
HeveaDB A genetic resource database for rubber tree genomic study \| Molecular & Genetic Resources for Hevea tree		30%
sRNAanno A database repository of uniformly-annotated small RNAs in plants \| Small RNAs (sRNAs) are essential regulatory molecules, including three major classes in plants, microRNAs (miRNAs), phased small interfering RNAs (phased siRNAs or phasiRNAs ...		30%
Nanobase A repository for DNA and RNA nanostructures.		30%
ORSO A data-driven social network connecting scientists to genomics datasets. ORSO (Online Resource for Social Omics) is a web application designed to help users find next generation sequencing (NGS) datasets relevant to their research interests. ORSO per ...		30%
Gene4HL An Integrated Genetic Database for Hearing Loss.		30%
RGEN Computational tools and libraries for CRISPR/Cas9-derived RNA-guided engineered nucleases (RGENs).		30%
GESS v2 Advanced Functions Embedded in the Second Version of Database, Global Evaluation of SARS-CoV-2/hCoV-19 Sequences 2.		30%
CohesinDB A comprehensive database for decoding cohesin-related epigenomes, 3D genomes and transcriptomes in human cells.		30%
Gowinda Gowinda: unbiased analysis of gene set enrichment for Genome Wide Association Studies		30%
MitoLink A generic integrated web-based workflow system to evaluate genotype-phenotype correlations in human mitochondrial diseases.		30%
REVA REVA as a Well-curated Database for Human Expression-modulating Variants.		30%
CoxBase CoxBase is an online platform for epidemiological surveillance, visualization, analysis and typing of Coxiella burnetii genomic sequence.		30%
EpiMOLAS EpiMOLAS (Epi-genoMics OnLine Analysis System) is an intuitive web-based framework for genome-wide DNA methylation analysis.		30%
NexGenEx-Tom Gene expression platform to investigate gene expression and functionalities in the tomato genome. It includes expression data from cultivated species/varieties Heinz 1706, Ailsa Craig and Solanum pimpinellifolium.		30%
EUAdb EUAdb is a database for COVID-19 test development that contains standardized information about Emergency Use Authorization-issued tests and is focused on RT-qPCR diagnostic tests, or high complexity molecular-based laboratory developed tests.		30%
BDdb BDdb is a comprehensive database associated with birth-defect-related diseases. It consists of multi-omics datasets involving tens of common birth-defect diseases, and BDdb supplements more than 2000 biomarkers belonging to 22 types of birth defects.		30%
DDBJ BioProject The DDBJ BioProject resource organizes both the projects and the data from those projects which is deposited into several archival databases maintained by members of the INSDC. This allows searching by characteristics of these projects, using the pro ...		30%
TEx-MST TEx-MST is a novel bioinformatic database for providing the valuable expression information of MANE-select transcripts in normal human tissues.		30%
PlantRep Plant Repeat Database (PlantRep) provides re-annotated repeat sequences of plants using a uniform pipeline. The current version of PlantRep contains 206.04 Gb of 396,041,410 repeats from 459 species that were divided into 15 clades based on their phylog ...		30%
CSI NGS Portal An Online Platform for Automated NGS Data Analysis and Sharing. CSI NGS Portal is an online platform for fully automated NGS data analysis and sharing. CSI NGS Portal uses a single, randomly generated, persistent, secure and http-only browser cookie ...		30%
National Wild Seed Resource Center The National Important Wild Plant Germplasm Repository has ten types of resources and data such as seeds, DNA, isolated materials, dried leaves, etc. totaling about 180,000 copies		30%
CANT-HYD Calgary approach to ANnoTating HYDrocarbon degradation genes (CANT-HYD), a database of 37 HMMs of marker genes involved in anaerobic and aerobic degradation pathways of aliphatic and aromatic hydrocarbons.		30%
REPIC A database for exploring N6-methyladenosine methylome. REPIC (RNA Epitranscriptome Collection) is a database dedicated to provide a new resource to investigate potential functions and mechanisms of N6-adenosine methylation (m6A) modifications. Curre ...		30%
Physical mapping data at Canada's Michael Smith Genome Sciences Centre - Data FPC Mapping data files from species that have been fingerprinted at Canada's Michael Smith Genome Sciences Centre (BCGSC).		30%
Combined QTL Map of Dairy Cattle Traits Note (2021-09-01): repository is offline. Background: Many studies have been conducted to detect quantitative trait loci (QTL) in dairy cattle. However, these studies are diverse in terms of their differing resource populations, marker ...		30%

*ReputationScore indicates how established a given datasource is.
