Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space.
PMID:35199087
GA4GH: International policies and standards for data sharing across genomic research and healthcare.
PMID:35072136
CRAM 3.1: Advances in the CRAM File Format.
PMID:34999766
SamQL: a structured query language and filtering tool for the SAM/BAM file format.
PMID:34600480
PIGG defines the Emm blood group system.
PMID:34535746
Hamming-shifting graph of genomic short reads: Efficient construction and its application for compression.
PMID:34280186
Refget: standardised access to reference sequences.
PMID:34260694
FASTA/Q data compressors for MapReduce-Hadoop genomics: space and time savings made easy.
PMID:33752596
Megadepth: efficient coverage quantification for BigWigs and BAMs.
PMID:33693500
HTSlib: C library for reading/writing high-throughput sequencing data.
PMID:33594436
Twelve years of SAMtools and BCFtools.
PMID:33590861
Sharp Second-Order Pointwise Asymptotics for Lossless Compression with Side Information.
PMID:33286477
Comparison of Compression-Based Measures with Application to the Evolution of Primate Genomes.
PMID:33265483
Efficient DNA sequence compression with neural networks.
PMID:33179040
Revealing Prognosis-Related Pathways at the Individual Level by a Comprehensive Analysis of Different Cancer Transcription Data.
PMID:33138076
Practical guide for managing large-scale human genome data in research.
PMID:33097812
IonCRAM: a reference-based compression tool for ion torrent sequence files.
PMID:32907531
Chromatin binding of FOXA1 is promoted by LSD1-mediated demethylation in prostate cancer.
PMID:32868907
Towards standardization guidelines for in silico approaches in personalized medicine.
PMID:32827396
A systematic comparison of pharmacogene star allele calling bioinformatics algorithms: a focus on CYP2D6 genotyping.
PMID:32789024
Practical estimation of cloud storage costs for clinical genomic data.
PMID:32529017
Vertical lossless genomic data compression tools for assembled genomes: A systematic literature review.
PMID:32453750
Genomic Sequencing Capacity, Data Retention, and Personal Access to Raw Data in Europe.
PMID:32435258
How Can Law and Policy Advance Quality in Genomic Analysis and Interpretation for Clinical Care?
PMID:32342785
Tximeta: Reference sequence checksums for provenance identification in RNA-seq.
PMID:32097405
GABAC: an arithmetic coding solution for genomic data.
PMID:31830243
svtools: population-scale analysis of structural variation.
PMID:31218349
Mind the gap: resources required to receive, process and interpret research-returned whole genome data.
PMID:31161416
Cram-JS: reference-based decompression in node and the browser.
PMID:31099383
Genomic Analysis in the Age of Human Genome Sequencing.
PMID:30901550
Tackling the Challenges of FASTQ Referential Compression.
PMID:30792576
Alfred: interactive multi-sample BAM alignment statistics, feature counting and feature annotation for long- and short-read sequencing.
PMID:30520945
TRCMGene: A two-step referential compression method for the efficient storage of genetic data.
PMID:30395579
BdBG: a bucket-based method for compressing genome sequencing data with dynamic de Bruijn graphs.
PMID:30364599
Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects.
PMID:30279509
Review of applications of high-throughput sequencing in personalized medicine: barriers and facilitators of future progress in research and clinical application.
PMID:30084865
The Terabase Search Engine: a large-scale relational database of short-read sequences.
PMID:30052772
Crumble: reference free lossy compression of sequence quality values.
PMID:29992288
Genomic big data hitting the storage bottleneck.
PMID:29782620
Diversity of fungi associated with roots of Calanthe orchid species in Korea.
PMID:29299843
CALQ: compression of quality values of aligned sequencing data.
PMID:29186284
Ensembl Genomes 2018: an integrated omics infrastructure for non-vertebrate species.
PMID:29092050
GeneComp, a new reference-based compressor for SAM files.
PMID:29046896
Traversing the k-mer Landscape of NGS Read Datasets for Quality Score Sparsification.
PMID:28825060
Alignment of 1000 Genomes Project reads to reference assembly GRCh38.
PMID:28531267
Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly.
PMID:28396521
The RNASeq-er API - a gateway to systematically updated analysis of public RNA-seq data.
PMID:28369191
LW-FQZip 2: a parallelized reference-based compression of FASTQ files.
PMID:28320326
Using reference-free compressed data structures to analyze sequencing reads from thousands of human genomes.
PMID:27986821
ascatNgs: Identifying Somatically Acquired Copy-Number Alterations from Whole-Genome Sequencing Data.
PMID:27930809
cgpCaVEManWrapper: Simple Execution of CaVEMan in Order to Detect Somatic Single Nucleotide Variants in NGS Data.
PMID:27930805
A privacy-preserving solution for compressed storage and selective retrieval of genomic data.
PMID:27789525
Comparison of high-throughput sequencing data compression tools.
PMID:27776113
A new algorithm for "the LCS problem" with application in compressing genome resequencing data.
PMID:27556803
Towards precision medicine.
PMID:27528417
Fourth Generation of Next-Generation Sequencing Technologies: Promise and Consequences.
PMID:27406789
Boiler: lossy compression of RNA-seq alignments using coverage vectors.
PMID:27298258
Recommendations on e-infrastructures for next-generation sequencing.
PMID:27267963
VariantBam: filtering and profiling of next-generational sequencing data using region-specific rules.
PMID:27153727
The challenges of big data.
PMID:27147249
CARGO: effective format-free compressed storage of genomic information.
PMID:27131376
Novel bioinformatic developments for exome sequencing.
PMID:27075447
Compressive mapping for next-generation sequencing.
PMID:27054987
The real cost of sequencing: scaling computation to keep pace with data generation.
PMID:27009100
Effect of lossy compression of quality scores on variant calling.
PMID:26966283
MetaCRAM: an integrated pipeline for metagenomic taxonomy identification and compression.
PMID:26895947
The European Bioinformatics Institute in 2016: Data growth and integration.
PMID:26673705
The International Nucleotide Sequence Database Collaboration.
PMID:26657633
Biological data sciences in genome research.
PMID:26430150
Reference-free compression of high throughput sequencing data with a probabilistic de Bruijn graph.
PMID:26370285
elPrep: High-Performance Preparation of Sequence Alignment/Map Files for Variant Calling.
PMID:26182406
Big Data: Astronomical or Genomical?
PMID:26151137
ERGC: an efficient referential genome compression algorithm.
PMID:26139636
Compression of Large genomic datasets using COMRAD on Parallel Computing Platform.
PMID:26124572
GDC 2: Compression of large collections of genomes.
PMID:26108279
LFQC: a lossless compression algorithm for FASTQ files.
PMID:26093148
Light-weight reference-based compression of FASTQ data.
PMID:26051252
QVZ: lossy compression of quality values.
PMID:26026138
Data-dependent bucketing improves reference-free compression of sequencing reads.
PMID:25910696
Quality score compression improves genotyping accuracy.
PMID:25748910
Extending reference assembly models.
PMID:25651527
Reference-based compression of short-read sequences using path encoding.
PMID:25649622
Streamlined Genome Sequence Compression using Distributed Source Coding.
PMID:25520552
The Essential Component in DNA-Based Information Storage System: Robust Error-Tolerating Module.
PMID:25414846
The Harvest suite for rapid core-genome alignment and visualization of thousands of intraspecific microbial genomes.
PMID:25410596
Aligned genomic data compression via improved modeling.
PMID:25395305
Fast lossless compression via cascading Bloom filters.
PMID:25252952
Tangled up in two: a burst of genome duplications at the end of the Cretaceous and the consequences for plant evolution.
PMID:24958926
The Scramble conversion tool.
PMID:24930138
Advances in genome studies in plants and animals.
PMID:24626952
ENZYMAP: exploiting protein annotation for modeling and predicting EC number changes in UniProt/Swiss-Prot.
PMID:24586563
XS: a FASTQ read simulator.
PMID:24433564
HUGO: Hierarchical mUlti-reference Genome cOmpression for aligned reads.
PMID:24368726
SRComp: short read sequence compression using burstsort and Elias omega coding.
PMID:24349065
DNA-COMPACT: DNA COMpression based on a pattern-aware contextual modeling technique.
PMID:24282536
The European Bioinformatics Institute's data resources 2014.
PMID:24271396
Compression of structured high-throughput sequencing data.
PMID:24260313
Data compression for sequencing data.
PMID:24252160
Human neuroimaging as a "Big Data" science.
PMID:24113873
Short read alignment with populations of genomes.
PMID:23813006
QualComp: a new lossy compressor for quality scores based on rate distortion theory.
PMID:23758828
Using Genome Query Language to uncover genetic variation.
PMID:23751181
Sequence squeeze: an open contest for sequence compression.
PMID:23596984
Computational solutions for omics data.
PMID:23594911
Existing and emerging technologies for tumor genomic profiling.
PMID:23589546
The future of DNA sequence archiving.
PMID:23587147
Compression of FASTQ and SAM format sequencing data.
PMID:23533605
Facing growth in the European Nucleotide Archive.
PMID:23203883
Adaptive efficient compression of genomes.
PMID:23146997
NGC: lossless and lossy compression of aligned high-throughput sequencing data.
PMID:23066097
SCALCE: boosting sequence compression algorithms using locally consistent encoding.
PMID:23047557
Compression of next-generation sequencing reads aided by highly efficient de novo assembly.
PMID:22904078
Compressive genomics.
PMID:22781691
Metagenomics - a guide from sampling to data analysis.
PMID:22587947
Genomics and privacy: implications of the new reality of closed data for the field.
PMID:22144881
GReEn: a tool for efficient compression of genome resequencing data.
PMID:22139935
Next-generation sequencing technologies and applications for human genetic history and forensics.
PMID:22115430
Major submissions tool developments at the European Nucleotide Archive.
PMID:22080548
The Sequence Read Archive: explosive growth of sequencing data.
PMID:22009675
ReCoil - an algorithm for compression of extremely large datasets of DNA data.
PMID:21988957
Developing and implementing an institute-wide data sharing policy.
PMID:21955348
The real cost of sequencing: higher than you think!
PMID:21867570
SEED: efficient clustering of next-generation sequences.
PMID:21810899