Tag: sequence analysis


Found 71 sources
Source Match ReputationScore*

Pfam


The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Pfam also generates higher-level groupings of related entries, known as clans. A clan is a collection of Pf ...
100%

GenBank


GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. The complete release notes for the current version of GenBank are available on the NCBI ftp site. A new release is made every two months. G ...
91%

SILVA


SILVA is a comprehensive, quality-controlled web resource for up-to-date aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains alongside supplementary online services. In addition to data products, SILVA provide ...
90%

Sequence Read Archive


The Sequence Read Archive (SRA) stores raw sequencing data from next-generation sequencing platforms. Data submitted to SRA is organized using a metadata model consisting of six objects: study, sample, experiment, run, analysis and submissi ...
86%

Database of Single Nucleotide Polymorphism


dbSNP contains human single nucleotide variations, microsatellites, and small-scale insertions and deletions along with publication, population frequency, molecular consequence, and genomic and RefSeq mapping information for both common variations an ...
84%

Conserved Domain Database


The Conserved Domain Database (CDD) brings together several collections of multiple sequence alignments representing conserved domains, including NCBI-curated domains, which use 3D-structure information to explicitly define domain boundaries and p ...
81%

NCBI Gene


The Entrez Global Query Cross-Database Search System is a federated search engine, or web portal that allows users to search many discrete health sciences databases at the National Center for Biotechnology Information (NCBI) website. Entrez can effic ...
68%

PROSITE


PROSITE consists of documentation entries describing protein domains, families and functional sites as well as associated patterns and profiles to identify them.
68%

Reference Sequence Database


The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins.
64%

PomBase


PomBase is a model organism database that provides organization of and access to scientific data for the fission yeast Schizosaccharomyces pombe. PomBase supports genomic sequence and features, genome-wide datasets and manual literature curation as w ...
59%

Information system for G protein-coupled receptors (GPCRs)


The GPCRDB is a molecular-class information system that collects, combines, validates and stores large amounts of heterogenous data on G protein-coupled receptors (GPCRs). The GPCRDB contains data on sequences, ligand binding constants and mutations. ...
59%

NCBI


The National Center for Biotechnology Information advances science and health by providing access to biomedical and genomic information.
56%

UniGene gene-oriented nucleotide sequence clusters


Each UniGene entry is a set of transcript sequences that appear to come from the same transcription locus (gene or expressed pseudogene), together with information on protein similarities, gene expression, cDNA clone reagents, and genomic location.
56%

European Hepatitis C Virus database


The euHCVdb is mainly oriented towards protein sequence, structure and function analyses and structural biology of Hepatitis C Virus.
55%

SwissRegulon


The SwissRegulon Database contains genome-wide annotations of regulatory sites. The predictions are based on Bayesian probabilistic analysis of a combination of input information including i) Experimentally determined binding sites reported in the li ...
54%

Universal PBM Resource for Oligonucleotide Binding Evaluation


The UniPROBE (Universal PBM Resource for Oligonucleotide Binding Evaluation) database hosts data generated by universal protein binding microarray (PBM) technology on the in vitro DNA binding specificities of proteins.
54%

Ribosomal Database Project (RDP-II)


The Ribosomal Database Project - II (RDP-II)(1) provides data, tools and services related to ribosomal RNA sequences to the research community. Through its website (http://rdp.cme.msu.edu), RDP-II offers aligned and annotated rRNA sequence data, anal ...
53%

Phospho.ELM


Phospho.ELM is a manually curated database of eukaryotic phosphorylation sites. The resource includes data collected from published literature as well as high-throughput data sets.
52%

Candida Genome Database


The Candida Genome Database (CGD) provides access to genomic sequence data and manually curated functional information about genes and proteins of the human pathogen Candida albicans. It collects gene names and aliases, and assigns gene ontology term ...
52%

PASS2


PASS2 contains alignments of structural motifs of protein superfamilies. PASS2 is an automatic version of the original superfamily alignment database, CAMPASS (CAMbridge database of Protein Alignments organised as Structural Superfamilies). PASS2 con ...
52%

Ensembl Mouse Genome Browser


Analysis of finished and draft mouse genomic clone sequences.
51%

Ensembl Compara


Ensembl Compara provides cross-species resources and analyses, at both the sequence level and the gene level.
51%

IMGT/LIGM-DB


IMGT/LIGM-DB is the IMGT® comprehensive database of immunoglobulin (IG) and T cell receptor (TR) nucleotide sequences, from human and other vertebrate species, with translation for fully annotated sequences, created in 1989 by LIGM (http://www.imgt.o ...
50%

HAMAP database of microbial protein families


HAMAP is a system, based on manual protein annotation, that identifies and semi-automatically annotates proteins that are part of well-conserved families or subfamilies: the HAMAP families. HAMAP is based on manually created family rules and is appli ...
49%

Codon Usage Database


Find GC content and frequency of codon usage for any organism that has a sequence in GenBank.
49%

GenomeNet


Network of database and computational resources including KEGG (pathways, interactions, etc.) and DBGET/LinkDB (an integrated database retrieval system). It also hosts several web-based tools for sequence analysis (e.g. BLAST, Motif, Clustal W).
49%

Gene3D


Gene3D takes CATH domain families (from PDB structures) and assigns them to the millions of protein sequences with no PDB structures (using hidden Markov models generated from HMMER).
49%

Bacterial protein tYrosine Kinase database


The Bacterial protein tYrosine Kinase database (BYKdb) contains computer-annotated BY-kinase sequences. The database web interface allows static and dynamic queries and provides integrated analysis tools including sequence annotation.
48%

PIR SuperFamily


The PIR SuperFamily concept is being used as a guiding principle to provide comprehensive and non-overlapping clustering of UniProtKB sequences into a hierarchical order to reflect their evolutionary relationships.
46%

HOGENOM


HOGENOM is a phylogenomic database providing families of homologous genes and associated phylogenetic trees (and sequence alignments) for a wide set of sequenced organisms.
46%

EffectiveDB


The Effective database contains pre-calculated predictions of bacterial secreted proteins and of functional secretion systems. Effective bundles various tools to recognize Type III secretion signals, conserved binding sites of Type III chaperones, eu ...
43%

Autophagy Database


Proteins involved in self-digestion of eukaryotic cells
43%

Homologous Vertebrate Genes Database


HOVERGEN is a database of homologous vertebrate genes that allows one to select sets of homologous genes among vertebrate species, and to visualize multiple alignments and phylogenetic trees.
43%

Entrez


NCBI information retrieval system, including GenBank, MMDB (structures), genomes, population sets, OMIM, taxonomy and PubMed.
42%

ParameciumDB


ParameciumDB is a new model organism database for Paramecium, built using components of the Generic Model Organism Database (http://www.gmod.org) construction set (Chado relational database schema, Turnkey generic web framework and Gbrowse). The data ...
41%

tRNADB-CE


tRNA Gene DataBase Curated by Experts
40%

MAR databases


The MAR databases is a collection of manually curated marine microbial contextual and sequence databases, based at the Marine Metagenomics Portal. This was developed as a part of the ELIXIR EXCELERATE project in 2017 and is maintained by The Center f ...
40%

CR-EST - Crop ESTs


The crop EST database CR-EST (http://pgrc.ipk-gatersleben.de/cr-est/) is a publicly available online resource providing access to sequence, classification, clustering, and annotation data of crop EST projects at IPK Gatersleben, Germany. CR-EST curre ...
40%

Ebola and Hemorrhagic Fever Virus Database


The Ebola and Hemorrhagic Fever Virus Database stems from the Hemorrhagic Fever Viruses (HFV) Database Project founded by Dr. Carla Kuiken in 2009 at the Los Alamos National Laboratory (LANL). The HFV Database was modeled on the Los Alamos HIV Databa ...
39%

ROdent Unidentified Gene-Encoded large proteins


The ROUGE protein database is a sister database of HUGE protein database which has accumulated the results of comprehensive sequence analysis of human long cDNAs (KIAA cDNAs). The ROUGE protein database has been created to publicize the information o ...
39%

Nematodes.org


Wiki for coordinating nematode sequencing projects
39%

siRNAdb


The siRNA database provides a gene-centric view of human siRNA experimental data, including siRNAs of known efficacy and siRNAs predicted to be of high efficacy by siSearch. Linked to these sequences is information including siRNA thermodynamic prope ...
39%

mESAdb


microRNA Expression and Sequence Analysis Database
38%

Nucleotide Sequence Database Collaboration


This database is a joint effort to collect and disseminate databases containing DNA and RNA sequences. It is a long-standing foundational initiative that operates between DDBJ, EMBL-EBI and NCBI. It covers the spectrum of data from raw reads, th ...
37%

SIMAP


Protein sequences are of utmost importance for studying the function and evolution of genes and genomes. Therefore a rich collection of methods in computational biology relies on the analysis and comparison of protein sequences. Many of these intensi ...
37%

ElastoDB


Repository for well-characterized elastin sequences to facilitate its study. The database has since expanded to include other non-elastin sequences that share elastic properties.
36%

Alias


A tool for converting identifiers in which multiple aliases are used to refer to sequences. Also available as a stand-alone tool.
36%

HMS-ICS


The Hyperlink Management System (HMS) automatically updates and maintains hyperlinks among major databases using various data IDs (e.g. HUGO Gene Symbols, IDs from PDB, UniProt). The ID Converter System (ICS) supports the conversion of data IDs using ...
36%

EBI patent sequences


Non-redundant databases of patent DNA and protein sequences
36%

MatrisomeDB


The ECM-protein knowledge database. Please follow MatrisomeDB. MatrisomeDB will be hosted at matrisomedb.org very soon.
35%

GlycoPOST


Raw Mass Spectrometry glycomics data
34%

Tabloid Proteome


An annotated database of protein associations. Tabloid Proteome is a database of protein association networks generated using publicly available mass spectrometry-based experiments in PRIDE. These associations represent a broad range of biological a ...
34%

Kassiopeia


A web application for the generation, storage, and presentation of genome-wide analyses of mutually exclusive exonomes.
34%

UniRef


The UniProt Reference Clusters are three separate datasets that compress sequence space at different resolutions, achieved by merging sequences and sub-sequences that are 100% (UniRef100), >=90% (UniRef90), or >=50% (UniRef50) identical, regardless o ...
34%

Amordad


Database engine for comparing metagenomic data at massive scale. It first obtains the sequence signature of metagenomes and organizes them as points in high dimensional space.
34%

Mabellini


A genome-wide database for understanding the structural proteome and evaluating prospective antimicrobial targets of the emerging pathogen Mycobacterium abscessus. An on-line source for Mycobacterium abscessus modeled structural proteome. MabeLLINI ...
33%

2DE-pattern


2DE-pattern is a database containing data on protein, isoform and proteoform profiles.
32%

APPRIS


Annotates variants with biological data such as protein structural information, functionally important residues, conservation of functional domains and evidence of cross-species conservation.
32%

PIR - Protein Information Resource


The Protein Information Resource (PIR) is an integrated public bioinformatics resource that supports genomic and proteomic research and scientific studies. PIR has provided many protein databases and analysis tools to the scientific community, includ ...
32%

miROrtho


Computational prediction of animal microRNA genes
32%

DIGIT


DIGIT is a database of immunoglobulin variable domain sequences annotated with the type of antigen, the germline sequences and pairing information between light and heavy chains.
32%

SDRDB


Short-chain dehydrogenases/reductases database.
30%

FLAD


Forensic loci allele database.
30%

Genome Trax


A search tool for finding variants at specific chromosome coordinates. The results can be integrated into an NGS pipeline.
30%

RSpred


A Rifin/Stevor prediction tool.
30%

MDR database


Medium-chain dehydrogenases/reductases database.
30%

DGD


Provides a list of groups of co-located and duplicated genes.
30%

NCBI PopSet


NCBI PopSet collects DNA sequences to analyze the ways that populations are related by evolution. Such sequences indicate if populations originate from different members of the same species or from organisms of different species entirely.
30%

Entrez Protein Clusters


A collection of related protein sequences (clusters) consisting of proteins derived from the annotations of whole genomes, organelles and plasmids. It is currently limited to Archaea, Bacteria, Plants, Fungi, Protozoans, and Viruses.
30%

Antar


Predicts miRNA targets for human and mouse using two predictive models: a model trained on microarray studies following transfection, and a model trained on PAR-CLIP datasets.
30%

AniProtDB


The Animal Proteome Database (AniProtDB) is a comprehensive collection of proteomes from 100 species spanning 21 animal phyla. In addition to providing open access to this collection of high-quality metazoan proteomes, information on predicted protei ...
30%

*ReputationScore indicates how established a given data source is.



