Genetic databases articles within Nature Communications

Featured

  • Article
    | Open Access

    Accurate benchmarking of small variant calling is critical for the continued improvement of human genome sequencing. Here, the authors show that current approaches are biased towards certain variant representations and develop a new approach to ensure consistent and accurate benchmarking, regardless of the original variant representations.

    • Tim Dunn & Satish Narayanasamy
  • Article
    | Open Access

    Rare Mendelian disorders pose a major diagnostic challenge, but evaluation of automated tools that aim to uncover causal genes is limited. Here, the authors present a computational pipeline that simulates realistic clinical datasets to address this deficit.

    • Emily Alsentzer, Samuel G. Finlayson & Isaac S. Kohane
  • Article
    | Open Access

    Research aimed at improving healthcare has largely focused on male animals and cells. Here, the authors use data from the International Mouse Phenotyping Consortium to show that body weight does not account for all phenotypic differences between male and female mice, supporting more female-focused research.

    • Laura A. B. Wilson, Susanne R. K. Zajitschek & Shinichi Nakagawa
  • Article
    | Open Access

    Studies of cell heterogeneity in white matter in primates have been limited to date. Here the authors describe a marmoset brain cell atlas that bridges rodent and human data, revealing strong gray-white matter glial segregation.

    • Jing-Ping Lin, Hannah M. Kelly & Daniel S. Reich
  • Article
    | Open Access

    A comprehensive data portal to explore plant regulomes is still unavailable. Here, the authors develop ChIP-Hub, a web-based platform built to ENCODE standards, and demonstrate its applications in the identification of hierarchical regulatory networks, tissue-specific chromatin dynamics, putative enhancers and chromatin states.

    • Liang-Yu Fu, Tao Zhu & Dijun Chen
  • Article
    | Open Access

    This paper describes the ‘4DN Data Portal’ that hosts data generated by the 4D Nucleome network, including Hi-C and other chromatin conformation capture assays, as well as various sequencing-based and imaging-based assays. Raw data have been uniformly processed to increase comparability and the portal is implemented with visualization tools to browse the data without download.

    • Sarah B. Reiff, Andrew J. Schroeder & Peter J. Park
  • Comment
    | Open Access

    Ensuring international benefit-sharing from sequence data without jeopardising open sharing is a major obstacle for the Convention on Biological Diversity and other UN negotiations. Here, the authors propose a solution to address the concerns of both developing countries and life scientists.

    • Amber Hartman Scholz, Jens Freitag & Jörg Overmann
  • Article
    | Open Access

    Local gene co-expression is found throughout the genome, but systematic analysis of these co-expressed genes is needed. Here, the authors identify local co-expressed genes in 49 tissues and characterize the genetic variants which may affect their expression and contribute to disease.

    • Diogo M. Ribeiro, Simone Rubinacci & Olivier Delaneau
  • Article
    | Open Access

    Single-nucleotide variants in enhancers or promoters may affect gene transcription by altering transcription factor binding sites. Here the authors present a meta-analysis empowered by a new statistical method covering thousands of ChIP-Seq experiments resulting in the identification of more than 500 thousand allele-specific binding (ASB) events in the human genome.

    • Sergey Abramov, Alexandr Boytsov & Ivan V. Kulakovskiy
  • Article
    | Open Access

    With the generation of large pan-cancer whole-exome and whole-genome sequencing projects, a question remains about how comparable these datasets are. Here, using The Cancer Genome Atlas samples analysed as part of the Pan-Cancer Analysis of Whole Genomes project, the authors explore the concordance of mutations called by whole exome sequencing and whole genome sequencing techniques.

    • Matthew H. Bailey, William U. Meyerson & Christian von Mering
  • Article
    | Open Access

    Schulz et al. systematically benchmark performance scaling with increasingly sophisticated prediction algorithms and with increasing sample size in reference machine-learning and biomedical datasets. Complicated nonlinear intervariable relationships remain largely inaccessible for predicting key phenotypes from typical brain scans.

    • Marc-Andre Schulz, B. T. Thomas Yeo & Danilo Bzdok
  • Article
    | Open Access

    Most databases of genotype-phenotype associations are manually curated. Here, Kuleshov et al. describe a machine curation system that extracts such relationships from the GWAS literature and synthesizes them into a structured knowledge base called GWASkb that can complement manually curated databases.

    • Volodymyr Kuleshov, Jialin Ding & Michael Snyder
  • Article
    | Open Access

    Short tandem repeats (STRs), like single nucleotide polymorphisms (SNPs), contribute to complex traits, but their ascertainment by next-generation sequencing is costly. Here, Saini et al. provide a SNP+STR haplotype reference panel that allows imputation of STRs from SNP array data.

    • Shubham Saini, Ileena Mitra & Melissa Gymrek
  • Article
    | Open Access

    Here, Libertini and colleagues devise a computational tool that can analyze whole-genome bisulfite sequencing (WGBS) data to recover ∼30% of the lost differential methylation position information. They use COMETgazer and COMETvintage to analyze 13 different methylome datasets to demonstrate their performance.

    • Emanuele Libertini, Simon C. Heath & Stephan Beck