Tag: dna sequences

Found 97 sources

Source	Match	ReputationScore*
UCSC Genome Browser database Genome assemblies and aligned annotations for a wide range of vertebrates and model organisms, along with an integrated tool set for visualizing, comparing, analyzing and sharing both publicly available and user-generated genomic datasets.		100%
GenBank GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. The complete release notes for the current version of GenBank are available on the NCBI ftp site. A new release is made every two months. G ...		82%
Sequence Read Archive The Sequence Read Archive (SRA) stores raw sequencing data from the next generation of sequencing platforms. Data submitted to SRA is organized using a metadata model consisting of six objects: study, sample, experiment, run, analysis and submissi ...		75%
The European Genome-phenome Archive The European Genome-phenome Archive (EGA) allows you to explore datasets from genomic studies, provided by a range of data providers. Access to datasets must be approved by the specified Data Access Committee (DAC).		68%
ARLEQUIN Project Format Arlequin ver 3.0 is a software package integrating several basic and advanced methods for population genetics data analysis, like the computation of standard genetic diversity indices, the estimation of allele and haplotype frequencies, tests of depa ...		65%
The UCSC Archaeal Genome Browser The UCSC Archaeal Genome Browser is a window on the biology of more than 100 microbial species from the domain Archaea. Basic gene annotation is derived from NCBI Genbank/RefSeq entries, with overlays of sequence conservation across multiple species, ...		64%
European Variation Archive The European Variation Archive is an open-access archive that accepts submission of, and provides access to, all types of genetic variation data from all species. All users are able to download any dataset, or query our study catalogue via our variat ...		62%
modMine modMine is an integrated web resource of data and tools to browse and search modENCODE data and experimental details, download results and access the GBrowse genome browser.		60%
MEROPS The MEROPS database is an information resource for peptidases (also termed proteases, proteinases and proteolytic enzymes) and the proteins that inhibit them.		59%
European Nucleotide Archive The European Nucleotide Archive (ENA) is a globally comprehensive data resource for nucleotide sequence, spanning raw data, alignments and assemblies, functional and taxonomic annotation and rich contextual data relating to sequenced samples and expe ...		59%
VectorBase VectorBase is a web-accessible data repository for information about invertebrate vectors of human pathogens. VectorBase annotates and maintains vector genomes (as well as a number of non-vector genomes for comparative analysis) providing an integrat ...		59%
Gramene: A curated, open-source, integrated data resource for comparative functional genomics in plants Gramene's purpose is to provide added value to plant genomics data sets available within the public sector, which will facilitate researchers' ability to understand the plant genomes and take advantage of genomic sequence known in one species for ide ...		59%
The Arabidopsis Information Resource The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana. Data available from TAIR includes the complete genome sequence along with gene structure, gene pro ...		57%
NCBI Gene The Entrez Global Query Cross-Database Search System is a federated search engine, or web portal that allows users to search many discrete health sciences databases at the National Center for Biotechnology Information (NCBI) website. Entrez can effic ...		57%
Ensembl Genomes The Ensembl genome annotation system, developed jointly by EMBL-EBI and the Wellcome Trust Sanger Institute, has been used for the annotation, analysis and display of vertebrate genomes since 2000. Since 2009, the Ensembl site has been complemented b ...		56%
Integrated Microbial Genomes And Microbiomes The Integrated Microbial Genomes (IMG/M) aims to support the annotation, analysis and distribution of microbial genome and microbiome datasets sequenced at DOE's Joint Genome Institute (JGI). It also serves as a community resource for analysis and an ...		53%
Comprehensive Antibiotic Resistance Database A bioinformatic database of antimicrobial resistance genes, their products and associated phenotypes.		53%
MGnify EBI Metagenomics has changed its name to MGnify to reflect a change in scope. This is a free-to-use resource aiming at supporting all metagenomics researchers. The service is an automated pipeline for the analysis and archiving of metagenomic data th ...		53%
CRISPRCasdb CRISPRCasdb acts as a gateway to a publicly accessible database and software to enable the easy detection of CRISPR sequences in locally-produced data and the consultation of CRISPR sequence data present in the database. It also gives information on ...		52%
PCR Primer Database for Gene Expression Detection and Quantification PrimerBank is a public resource for PCR primers. These primers are designed for gene expression detection or quantification (real-time PCR). PrimerBank contains over 306,800 primers covering most known human and mouse genes. There are several ways to ...		50%
PomBase PomBase is a model organism database that provides organization of and access to scientific data for the fission yeast Schizosaccharomyces pombe. PomBase supports genomic sequence and features, genome-wide datasets and manual literature curation as w ...		50%
Ensembl Protists Ensembl Protists stores protist genomes of interest, covering those involved in disease and of scientific interest. This includes genomes such as Plasmodium falciparum, Dictyostelium discoideum, Phytophthora infestans and Leishmania major. A majority ...		49%
Ensembl Metazoa Ensembl Metazoa provides access to genomes of metazoans of interest in disease, environmental sciences, agriculture and economic concern. Extensive coverage exists of diptera, nematodes, lepidoptera and hymenoptera.		49%
Virulence Factor Database VFDB is an integrated and comprehensive database of virulence factors for bacterial pathogens (also including Chlamydia and Mycoplasma).		48%
Eukaryotic Promoter Database The Eukaryotic Promoter Database (EPD) provides accurate transcription start site (TSS) information for promoters of 15 model organisms, from human to yeast to the malaria parasite Plasmodium falciparum. While the original database was a manually cur ...		48%
NCBI Virus NCBI Virus is a community portal for viral sequence data from RefSeq, GenBank and other NCBI repositories.		48%
Minimum Information about a MARKer gene Sequence MIMARKS is the metadata reporting standard of the Genomic Standards Consortium that covers marker gene sequences from environmental surveys or individual organisms		48%
TDR Targets TDR Targets integrates chemical and genomic information and allows users to prioritize targets and compounds to develop and repurpose new drugs and chemical tools for human pathogens. The TDR Target Project was started in 2005 after a call for applic ...		48%
Synthetic Biology Open Language The Synthetic Biology Open Language (SBOL) is a standard used for the in silico representation of genetic designs. SBOL is designed to allow synthetic biologists and genetic engineers to electronically exchange designs, send and receive genetic desig ...		46%
Ensembl Fungi Ensembl Fungi is a browser for fungal genomes. A majority of these are taken from the databases of the International Nucleotide Sequence Database Collaboration (the European Nucleotide Archive at the EBI, GenBank at the NCBI, and the DNA Database of ...		45%
Ensembl Bacteria Ensembl Bacteria is a browser for bacterial and archaeal genomes. These are taken from the databases of the International Nucleotide Sequence Database Collaboration (the European Nucleotide Archive at the EBI, GenBank at the NCBI, and the DNA Databas ...		45%
Progenetix - genomic copy number aberrations in cancer The Progenetix database provides an overview of copy number abnormalities in human cancer from Comparative Genomic Hybridization (CGH) experiments. With 30817 cases from 1016 publications (Oct 2013), Progenetix is the largest curated database for who ...		45%
DNA Data Bank of Japan An annotated collection of all publicly available nucleotide and protein sequences. DDBJ collects sequence data mainly from Japanese researchers, as well as researchers in other countries. DDBJ is part of the International Nucleotide Sequence Databas ...		45%
Minimum Information about any (x) Sequence The minimum information about any (x) sequence (MIxS) is an overarching framework of sequence metadata, that includes technology-specific checklists from the previous MIGS and MIMS standards, provides a way of introducing additional checklists such a ...		44%
Dfam The Dfam database is an open collection of DNA Transposable Element sequence alignments, hidden Markov Models (HMMs), consensus sequences, and genome annotations. Dfam represents a collection of multiple sequence alignments, each containing a set of r ...		44%
Genome Database for Rosaceae The Genome Database for Rosaceae (GDR) is a curated and integrated web-based relational database providing centralized access to Rosaceae genomics and genetics data and analysis tools to facilitate cross-species utilization of data.		44%
Genetic and Genomic Information System GnpIS is a multispecies integrative information system dedicated to plant and fungi pests. It bridges genetic and genomic data, allowing researchers access to both genetic information (e.g. genetic maps, quantitative trait loci, association genetics, ...		44%
Database of Sequence Tagged Sites dbSTS is an NCBI resource that contains sequence data for short genomic landmark sequences or Sequence Tagged Sites.		43%
Influenza Virus Resource Influenza Virus Resource presents data obtained from the NIAID Influenza Genome Sequencing Project as well as from GenBank, combined with tools for flu sequence analysis, annotation and submission to GenBank. In addition, it provides links to other r ...		43%
Candida Genome Database The Candida Genome Database (CGD) provides access to genomic sequence data and manually curated functional information about genes and proteins of the human pathogen Candida albicans. It collects gene names and aliases, and assigns gene ontology term ...		43%
Ensembl Plants Ensembl Plants holds the genomes of plants of significant interest. These range from those of agricultural importance to those that support primary research or are of environmental interest. Ensembl Plants datasets are constructed in a direct collaborati ...		42%
Rice Genome Annotation Project This website provides genome sequence from the Nipponbare subspecies of rice and annotation of the 12 rice chromosomes. These data are available through search pages and the Genome Browser that provides an integrated display of annotation data.		42%
Structure Function Linkage Database Archive Structure Function Linkage Database (SFLD) is a database of enzymes classified by linking sequences to chemical function. A hierarchical system is used to classify enzymes by family or superfamily; other category levels include functional domain, subg ...		42%
Regulatory Element Database for Drosophila REDfly is a curated collection of known Drosophila transcriptional cis-regulatory modules (CRMs) and transcription factor binding sites (TFBSs). REDfly seeks to include all experimentally verified fly regulatory elements along with their DNA sequence ...		41%
Genome Warehouse The Genome Warehouse (GWH) is a public archival resource housing genome-scale data for a wide range of species. GWH accepts a variety of data types, including whole genome, chloroplast, mitochondrion and plasmid. For each collected genome assembly, G ...		41%
Fungal and Oomycete genomics resource FungiDB is an integrated genomic and functional genomic database for the kingdom Fungi. The database integrates whole genome sequence and annotation and also includes experimental and environmental isolate sequence data. The database includes compara ...		40%
Genetic Codes NCBI takes great care to ensure that the translation for each coding sequence (CDS) present in GenBank records is correct. Central to this effort is careful checking on the taxonomy of each record and assignment of the correct genetic code for each o ...		40%
The Chromosome 7 Annotation Project The objective of this project is to generate the most comprehensive description of human chromosome 7 to facilitate biological discovery, disease gene research and medical genetic applications.		38%
CoryneRegNet 6.0 - Corynebacterial Regulation Network Corynebacterial Regulation Network is a reference database and analysis platform for corynebacterial transcription factors and gene regulatory networks.		38%
Human Gene and Protein Database Human Gene and Protein Database (HGPD) presents SDS-PAGE patterns and other information on human genes and proteins.		38%
Minimal Information About a Phylogenetic Analysis The MIAPA (minimum information about a phylogenetic analysis) checklist details the list of metadata necessary for researchers to evaluate or reuse a published phylogeny.		37%
Generic Feature Format Version 3 The Generic Feature Format Version 3 (GFF3) format was developed after earlier formats, although widely used, became fragmented into multiple incompatible dialects. The GFF3 format addresses the most common extensions to GFF, while preserving backwar ...		37%
Prokaryotic Operon DataBase The Prokaryotic Operon DataBase (ProOpDB) constitutes one of the most precise and complete repositories of operon predictions currently available. Using our novel and highly accurate operon algorithm, we have predicted the operon structures of more than 1,200 ...		37%
BacMap BacMap is a picture atlas of annotated bacterial genomes. It is an interactive visual database containing hundreds of fully labeled, zoomable, and searchable maps of bacterial genomes.		37%
ProPortal ProPortal is a database containing genomic, metagenomic, transcriptomic and field data for the marine cyanobacterium Prochlorococcus. They provide a source of cross-referenced data across multiple scales of biological organization—from the genome to ...		37%
The Yeast Metabolome DataBase The Yeast Metabolome Database (YMDB) is a manually curated database of small molecule metabolites found in or produced by Saccharomyces cerevisiae (also known as Baker’s yeast and Brewer’s yeast). This database covers metabolites described in textboo ...		37%
Information Commons for Rice Information Commons for Rice (IC4R) is a rice knowledgebase that incorporates rice data through multiple modules such as genome-wide expression profiles derived entirely from RNA-Seq data, resequencing-based genomic variations obtained from re-sequen ...		36%
Minimal Metagenome Sequence Analysis Standard A proposed set of minimal standard analyses necessary for proper interpretation of meta-omic data and to allow comparative metagenomics and metatranscriptomics. Please note: We cannot find an up-to-date website for this resource. As such, we have mar ...		36%
euL1db, the European database of L1-HS retrotransposon insertions in humans Retrotransposons, which comprise LINE, SINE and LTR-containing elements, account for almost half of our genome. They are mobile genetic elements - also known as jumping genes - but only the L1-HS subfamily has retained the ability to jump ...		36%
LegumeIP The LegumeIP 2.0 database hosts large-scale genomics and transcriptomics data and provides integrative bioinformatics tools for the study of gene function and evolution in legumes.		36%
Central Aspergillus Data REpository This project aims to support the international Aspergillus research community by gathering all genomic information regarding this significant genus into one resource - The Central Aspergillus REsource (CADRE). CADRE facilitates visualisation and anal ...		36%
CRAM CRAM is a sequencing read file format that is highly space efficient by using reference-based compression of sequence data and offers both lossless and lossy modes of compression. Building on early proof-of-principle for reference-based compression ( ...		36%
Plant Natural Antisense Transcripts Database Natural Antisense Transcripts (NATs), a kind of regulatory RNAs, occur prevalently in plant genomes and play significant roles in physiological and/or pathological processes. PlantNATsDB (Plant Natural Antisense Transcripts DataBase) is a platform fo ...		35%
GENI-ACT GENI-ACT is a resource that allows the research community to collaboratively annotate bacterial genomes. Changes can be suggested to existing genomes and these alterations can be ported back to NCBI Genbank. GENI-ACT also has modules which can be use ...		34%
VIRsiRNAdb VIRsiRNAdb contains information on experimentally validated Viral siRNA/shRNA which target viral genome regions. It provides efficacy information where available, as well as the siRNA sequence, viral target and subtype, as well as the target genomic ...		34%
StellaBase StellaBase is the Nematostella vectensis genomics database.		34%
MAR databases The MAR databases is a collection of manually curated marine microbial contextual and sequence databases, based at the Marine Metagenomics Portal. This was developed as a part of the ELIXIR EXCELERATE project in 2017 and is maintained by The Center f ...		34%
Minimal Information about a high throughput SEQuencing Experiment MINSEQE describes the Minimum Information about a high-throughput nucleotide SEQuencing Experiment that is needed to enable the unambiguous interpretation and facilitate reproduction of the results of the experiment. By analogy to the MIAME guideline ...		34%
Short Read Archive eXtensible Markup Language The SRA data model contains the following objects: Study: information about the sequencing project Sample: information about the sequenced samples Experiment: information about the libraries, platform; associated with study, sample(s) and run(s) Run: ...		34%
Type IV Secretion system Resource A web-based bacterial type IV secretion system resource for type IV secretion systems (T4SSs) and cognate effectors in bacteria.		34%
Alternative Poly(A) Sites database APASdb can visualize the precise map and usage quantification of different APA isoforms for all genes. The datasets are deeply profiled by the sequencing alternative polyadenylation sites (SAPAS) method capable of high-throughput sequencing 3'-ends o ...		34%
Human Disease-Related Viral Integration Sites Dr.VIS collects and locates human disease-related viral integration sites. So far, about 600 sites covering 5 virus organisms and 11 human diseases are available. Integration sites in Dr.VIS are located against chromosome, cytoband, gene and refseq p ...		33%
Interrupted coding sequences ICDS database is a database containing ICDS detected by a similarity-based approach. The definition of each interrupted gene is provided as well as the ICDS genomic localisation with the surrounding sequence.		33%
Global Initiative on Sharing Avian Influenza Data The GISAID Initiative promotes the international sharing of all influenza virus sequences, related clinical and epidemiological data associated with human viruses, and geographical as well as species-specific data associated with avian and other anim ...		32%
XenMine XenMine has been created to view, search and analyze Xenopus data, and provides essential information on gene expression changes and regulatory elements present in the genome. It contains published genomic datasets from both Xenopus tropicalis and Xe ...		32%
Oryza Tag Line Oryza Tag Line consists of a searchable database developed under the Oracle management system integrating phenotypic data resulting from the evaluation of the Genoplante rice insertion line library.		32%
Enzyme Structure Function Ontology The ESFO provides a new paradigm for organizing enzyme sequence, structure, and function information, whereby specific elements of enzyme sequence and structure are mapped to specific conserved aspects of function, thus facilitating the functional an ...		31%
OryGenesDB: an interactive tool for rice reverse genetics The aim of this Oryza sativa database was first to display sequence information such as the T-DNA and Ds flanking sequence tags (FSTs) produced in the framework of the French genomics initiative Genoplante and the EU consortium Cereal Gene Tags. This ...		31%
EcoliWiki: A Wiki-based community resource for Escherichia coli EcoliWiki is a community-based resource for the annotation of all non-pathogenic E. coli, its phages, plasmids, and mobile genetic elements.		31%
BEDgraph The bedGraph format allows display of continuous-valued data in track format. This display type is useful for probability scores and transcriptome data. This track type is similar to the wiggle (WIG) format, but unlike the wiggle format, data exporte ...		30%
Silkworm Pathogen Database Silkworm Pathogen Database (SilkPathDB) is a comprehensive resource for studying pathogens of the silkworm, including microsporidia, fungi, bacteria and viruses. SilkPathDB provides access to not only genomic data including functional annotation of gene ...		30%
BCL-2 Database BCL2DB is a database designed to integrate data on BCL-2 family members and BH3-only proteins.		30%
Transmembrane Helices in Genome Sequences A web based database of Transmembrane Helices in Genome Sequences.		30%
Bioconductor Bioconductor provides tools for the analysis and comprehension of high-throughput genomic data. Bioconductor uses the R statistical programming language, and is open source and open development.		29%
Chickpea Portal This resource contains genome and gene sequences, features and isolated chromosome alignments, while functional annotation can be searched in GBrowse. Chickpea forms a critical component of the Australian and Indian farming system, offering offer ...		29%
Acytostelium Gene Database Genome and transcriptome database of Acytostelium subglobosum		29%
Cnidarian Evolutionary Genomics Database CnidBase, the Cnidarian Evolutionary Genomics Database, is a tool for investigating the evolutionary, developmental and ecological factors that affect gene expression and gene function in cnidarians.		29%
Bio-Mirror A world bioinformatic public service for high-speed access to up-to-date DNA & protein biological sequence databanks.		28%
Access to Biological Collection Data DNA extension ABCDDNA is a theme specific extension for ABCD (Access to Biological Collections Data) created to facilitate storage and exchange of data related to DNA collection units, such as DNA extraction specifics, DNA quality parameters, and data characterisi ...		28%
Multiple Alignment Format The Multiple Alignment Format stores DNA level multiple alignments in an easily readable format between entire genomes. Unlike previous formats this resource can cope with forward and reverse strand directions, multiple pieces to the alignment, and s ...		28%
Genome Variation Format The Genome Variation Format (GVF) is a very simple file format for describing sequence alteration features at nucleotide resolution relative to a reference genome.		28%
GenBank Sequence Format GenBank Sequence Format (GenBank Flat File Format) consists of an annotation section and a sequence section. The start of the annotation section is marked by a line beginning with the word "LOCUS". The start of sequence section is marked by a line be ...		28%
Gene Transfer Format The Gene transfer format (GTF) is a file format used to hold information about gene structure. It is a tab-delimited text format based on the general feature format (GFF), but contains some additional conventions specific to gene information. A signi ...		28%
DDBJ Sequence Read Archive DDBJ Sequence Read Archive (DRA) is an archive database for output data generated by next-generation sequencing machines including Roche 454 GS System®, Illumina Genome Analyzer®, Applied Biosystems SOLiD® System, and others. DRA is a member of the I ...		26%
National Omics Data Encyclopedia The National Omics Data Encyclopedia (NODE) is a big data library with complete and integrative data storage, safe and efficiency-guaranteed data management as well as comprehensive and user-friendly data service functions. NODE stores raw sequence dat ...		26%
Binary sequence information Format A .2bit file stores multiple DNA sequences (up to 4 Gb total) in a compact randomly-accessible format. The file contains masking information as well as the DNA itself. The DNA sequence is represented as two bits per base with an associated list of regi ...		24%
.ACE format The ACE file format is a specification for storing data about genomic contigs. The original ACE format was developed for use with Consed, a program for viewing, editing, and finishing DNA sequence assemblies. ACE files are generated by various assemb ...		24%

*ReputationScore indicates how established a given datasource is.