Tag: sequence analysis


Found 84 sources
Source Match ReputationScore*

Pfam


The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Pfam also generates higher-level groupings of related entries, known as clans. A clan is a collection of Pf ...
100%

GenBank


GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. The complete release notes for the current version of GenBank are available on the NCBI ftp site. A new release is made every two months. G ...
94%

SILVA


SILVA is a comprehensive, quality-controlled web resource for up-to-date aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains alongside supplementary online services. In addition to data products, SILVA provide ...
92%

Database of Single Nucleotide Polymorphism


dbSNP contains human single nucleotide variations, microsatellites, and small-scale insertions and deletions along with publication, population frequency, molecular consequence, and genomic and RefSeq mapping information for both common variations an ...
87%

Sequence Read Archive


The Sequence Read Archive (SRA) stores raw sequencing data from the next generation of sequencing platforms. Data submitted to SRA is organized using a metadata model consisting of six objects: study, sample, experiment, run, analysis and submissi ...
86%

Conserved Domain Database


The Conserved Domain Database (CDD) brings together several collections of multiple sequence alignments representing conserved domains, including NCBI-curated domains, which use 3D-structure information to explicitly define domain boundaries and p ...
80%

PROSITE


PROSITE is a database of protein families and domains. PROSITE consists of documentation entries describing protein domains, families and functional sites as well as associated patterns and profiles to identify them.
67%

NCBI Gene


The Entrez Global Query Cross-Database Search System is a federated search engine, or web portal, that allows users to search many discrete health sciences databases at the National Center for Biotechnology Information (NCBI) website. Entrez can effic ...
66%

Reference Sequence Database


The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins.
63%

Information system for G protein-coupled receptors


The GPCRDB is a molecular-class information system that collects, combines, validates and stores large amounts of heterogeneous data on G protein-coupled receptors (GPCRs). The GPCRDB contains data on sequences, ligand binding constants and mutations. ...
58%

Database resources of the National Center for Biotechnology Information


The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts publish ...
57%

European Hepatitis C Virus database


The euHCVdb is mainly oriented towards protein sequence, structure and function analyses and structural biology of Hepatitis C Virus.
53%

SwissRegulon


The Swissregulon Database contains genome-wide annotations of regulatory sites. The predictions are based on Bayesian probabilistic analysis of a combination of input information including i) Experimentally determined binding sites reported in the li ...
53%

Universal PBM Resource for Oligonucleotide Binding Evaluation


The UniPROBE (Universal PBM Resource for Oligonucleotide Binding Evaluation) database hosts data generated by universal protein binding microarray (PBM) technology on the in vitro DNA binding specificities of proteins.
52%

Ribosomal Database Project (RDP-II)


The Ribosomal Database Project - II (RDP-II)(1) provides data, tools and services related to ribosomal RNA sequences to the research community. Through its website (http://rdp.cme.msu.edu), RDP-II offers aligned and annotated rRNA sequence data, anal ...
51%

Phospho.ELM


Phospho.ELM is a manually curated database of eukaryotic phosphorylation sites. The resource includes data collected from published literature as well as high-throughput data sets.
51%

Gene3D


Gene3D uses the information in CATH to predict the locations of structural domains on millions of protein sequences available in public databases. Sequence data from UniProtKB and Ensembl for domains with no experimentally determined structures are s ...
50%

Ensembl Mouse Genome Browser


Analysis of finished and draft mouse genomic clone sequences.
50%

Candida Genome Database


The Candida Genome Database (CGD) provides access to genomic sequence data and manually curated functional information about genes and proteins of the human pathogen Candida albicans. It collects gene names and aliases, and assigns gene ontology term ...
50%

PASS2


PASS2 contains alignments of structural motifs of protein superfamilies. PASS2 is an automatic version of the original superfamily alignment database, CAMPASS (CAMbridge database of Protein Alignments organised as Structural Superfamilies). PASS2 con ...
50%

Ensembl Compara


Ensembl Compara provides cross-species resources and analyses, at both the sequence level and the gene level.
50%

IMGT/LIGM-DB


IMGT/LIGM-DB is the IMGT® comprehensive database of immunoglobulin (IG) and T cell receptor (TR) nucleotide sequences, from human and other vertebrate species, with translation for fully annotated sequences, created in 1989 by LIGM (http://www.imgt.o ...
49%

Codon Usage Database


Find GC content and frequency of codon usage for any organism that has a sequence in GenBank.
48%

GenomeNet


Network of database and computational resources including KEGG (pathways, interactions, etc.) and DBGET/LinkDB (an integrated database retrieval system). It also hosts several web-based tools for sequence analysis (e.g. Blast, Motif, Clustal W).
48%

HAMAP database of microbial protein families


HAMAP is a system, based on manual protein annotation, that identifies and semi-automatically annotates proteins that are part of well-conserved families or subfamilies: the HAMAP families. HAMAP is based on manually created family rules and is appli ...
47%

Bacterial protein tYrosine Kinase database


The Bacterial protein tYrosine Kinase database (BYKdb) contains computer-annotated BY-kinase sequences. The database web interface allows static and dynamic queries and provides integrated analysis tools including sequence annotation.
46%

HOGENOM


HOGENOM is a phylogenomic database providing families of homologous genes and associated phylogenetic trees (and sequence alignments) for a wide set of sequenced organisms.
44%

PIR SuperFamily


The PIR SuperFamily concept is being used as a guiding principle to provide comprehensive and non-overlapping clustering of UniProtKB sequences into a hierarchical order to reflect their evolutionary relationships.
44%

APPRIS


Annotates variants with biological data such as protein structural information, functionally important residues, conservation of functional domains and evidence of cross-species conservation.
42%

EffectiveDB


The Effective database contains pre-calculated predictions of bacterial secreted proteins and of functional secretion systems. Effective bundles various tools to recognize Type III secretion signals, conserved binding sites of Type III chaperones, eu ...
42%

Autophagy Database


Proteins involved in self-digestion of eukaryotic cells
41%

Movebank Data Repository


This data repository allows users to publish animal tracking and animal-borne sensor datasets that have been uploaded to Movebank (https://www.movebank.org). Published datasets have gone through a submission and review process, and are associated wit ...
41%

Entrez


NCBI information retrieval system, including GenBank, MMDB (structures), genomes, population sets, OMIM, taxonomy and PubMed.
41%

Homologous Vertebrate Genes Database


HOVERGEN is a database of homologous vertebrate genes that allows one to select sets of homologous genes among vertebrate species, and to visualize multiple alignments and phylogenetic trees.
41%

ParameciumDB


ParameciumDB is a new model organism database for Paramecium, built using components of the Generic Model Organism Database (http://www.gmod.org) construction set (Chado relational database schema, Turnkey generic web framework and Gbrowse). The data ...
40%

tRNADB-CE


tRNA Gene DataBase Curated by Experts
39%

MAR databases


The MAR databases is a collection of manually curated marine microbial contextual and sequence databases, based at the Marine Metagenomics Portal. This was developed as a part of the ELIXIR EXCELERATE project in 2017 and is maintained by The Center f ...
39%

CR-EST - Crop ESTs


The crop EST database CR-EST (http://pgrc.ipk-gatersleben.de/cr-est/) is a publicly available online resource providing access to sequence, classification, clustering, and annotation data of crop EST projects at IPK Gatersleben, Germany. CR-EST curre ...
38%

ROdent Unidentified Gene-Encoded large proteins


The ROUGE protein database is a sister database of the HUGE protein database, which has accumulated the results of comprehensive sequence analysis of human long cDNAs (KIAA cDNAs). The ROUGE protein database has been created to publicize the information o ...
38%

Ebola and Hemorrhagic Fever Virus Database


The Ebola and Hemorrhagic Fever Virus Database stems from the Hemorrhagic Fever Viruses (HFV) Database Project founded by Dr. Carla Kuiken in 2009 at the Los Alamos National Laboratory (LANL). The HFV Database was modeled on the Los Alamos HIV Databa ...
37%

RNArchitecture


RNArchitecture is a database that provides a comprehensive description of relationships between known families of structured ncRNAs, with focus on sequence and structure similarities. RNArchitecture also provides literature information and links to o ...
37%

Nematodes.org


Wiki for coordinating nematode sequencing projects
37%

siRNAdb


The siRNA database provides a gene-centric view of human siRNA experimental data, including siRNAs of known efficacy and siRNAs predicted to be of high efficacy by siSearch. Linked to these sequences is information including siRNA thermodynamic prope ...
37%

GlycoPOST


GlycoPOST is a mass spectrometry data repository for glycomics. Users can release their "raw/processed" data via this site with a unique identifier number for the paper publication. Submission conditions are in accordance with the Minimum Information ...
36%

mESAdb


microRNA Expression and Sequence Analysis Database
36%

Nucleotide Sequence Database Collaboration


This database consists of a joint effort to collect and disseminate databases containing DNA and RNA sequences. It is a long-standing foundational initiative that operates between DDBJ, EMBL-EBI and NCBI. It covers the spectrum of data raw reads, th ...
36%

SIMAP


Protein sequences are of utmost importance for studying the function and evolution of genes and genomes. Therefore a rich collection of methods in computational biology relies on the analysis and comparison of protein sequences. Many of these intensi ...
36%

ElastoDB


Repository for well-characterized elastin sequences to facilitate its study. The database has since expanded to include other non-elastin sequences that share elastic properties.
35%

MatrisomeDB


The ECM-protein knowledge database. Please follow MatrisomeDB. MatrisomeDB will be hosted at matrisomedb.org very soon.
35%

Domain Interaction Graph Guided ExploreR


DIGGER is an essential resource for studying the mechanistic consequences of alternative splicing such as isoform-specific interaction and consequence of exon skipping. The database integrates information of domain-domain and protein-protein interact ...
35%

Alias


A tool for converting identifiers in which multiple aliases are used to refer to sequences. Also available as a stand-alone tool.
35%

HMS-ICS


The Hyperlink Management System (HMS) automatically updates and maintains hyperlinks among major databases using various data IDs (e.g. HUGO Gene Symbols, IDs from PDB, UniProt). The ID Converter System (ICS) supports the conversion of data IDs using ...
34%

EBI patent sequences


Non-redundant databases of patent DNA and protein sequences
34%

Kassiopeia


A web application for the generation, storage, and presentation of genome-wide analyses of mutually exclusive exonomes.
33%

Tabloid Proteome


An annotated database of protein associations. Tabloid Proteome is a database of protein association networks generated using publicly available mass spectrometry-based experiments in PRIDE. These associations represent a broad range of biological a ...
33%

DescribePROT


DescribePROT is a database containing annotations of 13 putative structural and functional properties at the amino acid level for ~1.4 million proteins from 83 popular/model organisms, to be extended to hundreds of additional organisms. Users can sear ...
33%

Amordad


Database engine for comparing metagenomic data at massive scale. It first obtains the sequence signature of metagenomes and organizes them as points in high dimensional space.
32%

Mabellini


A genome-wide database for understanding the structural proteome and evaluating prospective antimicrobial targets of the emerging pathogen Mycobacterium abscessus. An on-line source for Mycobacterium abscessus modeled structural proteome. MabeLLINI ...
32%

MassIVE.quant


A community resource of quantitative mass spectrometry-based proteomics datasets. MassIVE.quant is an extension of the Mass Spectrometry Interactive Virtual Environment (MassIVE) to provide the opportunity for large-scale deposition of data from qua ...
32%

UniRef


The UniProt Reference Clusters are three separate datasets that compress sequence space at different resolutions, achieved by merging sequences and sub-sequences that are 100% (UniRef100), >=90% (UniRef90), or >=50% (UniRef50) identical, regardless o ...
32%

SARS-CoV-2 3D database


This tool is for understanding the coronavirus proteome and evaluating possible drug targets.
32%

MRMAssayDB


A Comprehensive Resource for Targeted Proteomics Assays in the Community.
31%

CrustyBase


CrustyBase is an interactive online database for crustacean transcriptomes. CrustyBase provides an environment for navigating and visualising crustacean transcriptome datasets. Users can search existing transcriptomes or import new datasets of their ...
31%

2DE-pattern


2DE-pattern is a database containing data on proteins/isoforms/proteoforms profiles.
31%

PIR - Protein Information Resource


The Protein Information Resource (PIR) is an integrated public bioinformatics resource that supports genomic and proteomic research and scientific studies. PIR has provided many protein databases and analysis tools to the scientific community, includ ...
30%

UniGene


UniGene collects entries of transcript sequences from transcription loci from genes or expressed pseudogenes. Entries also contain information on the protein similarities, gene expressions, cDNA clone reagents, and genomic locations.
30%

miROrtho


Computational prediction of animal microRNA genes
30%

DIGIT


DIGIT is a database of immunoglobulin variable domain sequences annotated with the type of antigen, the germline sequences and pairing information between light and heavy chains.
30%

SARSCOVIDB


New Platform for the Analysis of the Molecular Impact of SARS-CoV-2 Viral Infection.
28%

SDRDB


Short-chain dehydrogenases/reductases database.
28%

FLAD


Forensic loci allele database.
28%

Genome Trax


A search tool for finding variants from specific chromosome coordinates. The results can be integrated into an NGS pipeline.
28%

RSpred


A Rifin/Stevor prediction tool.
28%

DGD


Provides a list of groups of co-located and duplicated genes.
28%

NCBI PopSet


NCBI PopSet collects DNA sequences to analyze the ways that populations are related by evolution. Such sequences indicate if populations originate from different members of the same species or from organisms of different species entirely.
28%

ExVe


ExVe is the knowledge base of orthologous proteins identified in fungal extracellular vesicles.
28%

Entrez Protein Clusters


A collection of related protein sequences (clusters) consisting of proteins derived from the annotations of whole genomes, organelles and plasmids. It is currently limited to Archaea, Bacteria, Plants, Fungi, Protozoans, and Viruses.
28%

HPREP


HPREP (Human Proteome Repeats Database) is a comprehensive database of human proteome repeats.
28%

Antar


Predicts miRNA targets for human and mouse using two predictive models: a model trained from microarray studies following transfection, and a model trained from PAR-CLIP datasets.
28%

CEDAR


CEDAR (The ComplexomE profiling DAta Resource) facilitates the storage and sharing of complexome profiling data, compliant with the MIACE standard, with the goal of enabling and simplifying their reuse.
28%

AniProtDB


The Animal Proteome Database (AniProtDB) is a comprehensive collection of proteomes from 100 species spanning 21 animal phyla. In addition to providing open access to this collection of high-quality metazoan proteomes, information on predicted protei ...
28%

DBSAV database


DBSAV database reports GTS scores of human genes and DeepSAV scores of SAVs in the human proteome, including pathogenic SAVs, benign SAVs, gnomAD SAVs observed in exome sequencing, and all possible SAVs by single nucleotide variations. Each human pro ...
28%

FABRIC Cancer Portal


FABRIC Cancer Portal is a comprehensive catalogue of human coding genes in cancer based on the FABRIC framework. FABRIC quantifies the selection of genes in tumors and weighs their evidence for being cancer drivers.
28%

ILDGDB


A manually curated database of genomics, transcriptomics, proteomics and drug information for interstitial lung diseases. ILDGDB is a manually curated database that provides comprehensive experimentally supported associations between genes and inter ...
28%

*ReputationScore indicates how established a given datasource is.



