Computational biology and bioinformatics articles within Nature Communications

Featured

  • Article
    | Open Access

    Motor learning is characterized by diverse cognitive processes, which lack a unified theoretical framework. Here, Takiyama et al. present a model demonstrating that motor learning is determined by prospective errors, which they test in a specially designed visuomotor adaptation task.

    • Ken Takiyama
    • Masaya Hirashima
    • Daichi Nozaki
  • Article
    | Open Access

    Understanding the dynamics of enzyme-substrate complexation provides an insight into potential drugs, but intermediate states are difficult to observe experimentally. Here, the authors use simulations and machine learning to analyse the binding of transition state inhibitors to purine nucleoside phosphorylase.

    • Sergio Decherchi
    • Anna Berteotti
    • Andrea Cavalli
  • Article
    | Open Access

    Understanding the epidemiology of malaria transmission between humans and mosquitoes is crucial for successful disease control. Analysing data from an 18-year malaria control programme, Churcher et al. show that decreased parasite prevalence in humans can be found concurrently with an increase in transmission efficiency.

    • Thomas S. Churcher
    • Jean-François Trape
    • Anna Cohuet
  • Article

    Body plan complexity is associated with the number of different cell types, yet the processes that create this diversity are unclear. Here the authors use transcriptomics to test the hypothesis that unlike cancer cells, novel normal cell types arise through sub-specialization of an ancestral cell type.

    • Cong Liang
    • Alistair R.R. Forrest
    • Günter P. Wagner
  • Article

    The structure of insect odorant receptors (ORs) has remained elusive due to their lack of homology to other proteins and the inability to obtain OR crystals. Here, the authors use amino acid evolutionary covariation patterns to fold these proteins de novo and generate the first three-dimensional models of insect ORs.

    • Thomas A. Hopf
    • Satoshi Morinaga
    • Richard Benton
  • Article
    | Open Access

    Cellular imaging studies can generate large volumes of complex phenotypic data; however, presenting this information in a form that quickly conveys trends in the data set remains a challenge. Sailem et al. present a tool that translates such data into easily interpretable cell-like glyphs.

    • Heba Z. Sailem
    • Julia E. Sero
    • Chris Bakal
  • Article
    | Open Access

    The correct assembly of genomes from sequencing data remains a challenge due to difficulties in correctly assigning the location of repeated DNA elements. Here the authors describe GRAAL, an algorithm that utilizes genome-wide chromosome contact data within a probabilistic framework to produce accurate genome assemblies.

    • Hervé Marie-Nelly
    • Martial Marbouty
    • Romain Koszul
  • Article

    MicroRNAs are short non-coding RNAs that post-transcriptionally regulate gene expression, but identifying their promoters and primary transcripts (pri-miRNAs) has been difficult. Here the authors describe microTSS, an algorithm that supports the precise identification of intergenic pri-miRNA transcription start sites.

    • Georgios Georgakilas
    • Ioannis S. Vlachos
    • Artemis G. Hatzigeorgiou
  • Article

    Metabolites produced by the gut microbiota can potentially affect our physiology. Here, the authors present a metabolomics strategy that models microbiota metabolism as a reaction network and uses pathway analysis to facilitate identification and characterization of microbial metabolites.

    • Gautham V. Sridharan
    • Kyungoh Choi
    • Arul Jayaraman
  • Article

    A large portion of the transcribed genome—such as introns and noncoding RNAs—is believed to not be translated into protein products. Here, the authors provide evidence for the existence of regulated peptide products that are translated from transcribed sequences generally characterized as noncoding.

    • Sudhakaran Prabakaran
    • Martin Hemberg
    • Judith A. Steen
  • Article
    | Open Access

    CpG islands are high GC content DNA elements that surround the majority of transcriptional start sites in eukaryotes. Here, the authors analyse over 200 genomic data sets to provide new insight into global CpG island-dependent regulatory mechanisms in differentiated and pluripotent stem cells.

    • Samuel Beck
    • Bum-Kyu Lee
    • Jonghwan Kim
  • Article

    Despite our growing understanding of their complexity, different types of RNA are still classified using technical rather than functional criteria. Andersson et al. show that categorization of RNAs based on stability and direction of transcription is an effective means of functional classification.

    • Robin Andersson
    • Peter Refsing Andersen
    • Albin Sandelin
  • Article

    The development of software tools to analyse large mass spectrometry data sets lags behind the increase in diversity of the data. Here the authors develop MS-GF+, a database search tool that outperforms other popular tools in identifying peptides from a variety of data sets.

    • Sangtae Kim
    • Pavel A. Pevzner
  • Article
    | Open Access

    No experimental evidence exists for intra-helical motion of DNA at the μs timescale, which has been attributed to technical difficulties in observing motion in this time range. Here, the authors demonstrate, using extensive molecular dynamics simulations and experimental analysis, that such motion is effectively absent from a B-DNA duplex.

    • Rodrigo Galindo-Murillo
    • Daniel R. Roe
    • Thomas E. Cheatham III
  • Article
    | Open Access

    Metastasizing tumour cells undergo epithelial-to-mesenchymal transition. Using both bioinformatic and in vivo approaches, Chanrion et al. identify combined Notch activation and p53 inactivation as a potent inducer of this transition, and apply this to create a highly metastatic tumour model in mice.

    • Maia Chanrion
    • Inna Kuperstein
    • Sylvie Robine
  • Article
    | Open Access

    Linear mixed models (LMMs) provide a powerful method for studying genotype–phenotype associations. Here the authors present an LMM application that estimates an optimal transformation from observed data and increases the accuracy of heritability estimation and phenotype prediction.

    • Nicolo Fusi
    • Christoph Lippert
    • Oliver Stegle
  • Article

    The functional consequences of naturally occurring variation in ribosomal DNA (rDNA) copy number are poorly understood. Here the authors estimate rDNA copy number and mitochondrial DNA abundance in humans using whole-genome short-read DNA sequencing and characterize global regulatory mechanisms for cellular homeostasis and adaptation.

    • John G. Gibbons
    • Alan T. Branco
    • Bernardo Lemos
  • Article
    | Open Access

    Common methods to detect adenosine-to-inosine RNA editing sites rely on mapping short RNA reads to the genome while allowing only a limited number of mismatches. Here, Porath et al. present a novel RNA-seq based approach to identify hyper-edited reads that significantly expands the RNA editome.

    • Hagit T. Porath
    • Shai Carmi
    • Erez Y. Levanon
  • Article
    | Open Access

    Metagenomic studies of microbial communities often report DNA sequences from unidentified viruses. Here, Dutilh et al. analyse metagenomic data to reveal the complete genome of an abundant, ubiquitous virus from human faeces, and predict that the virus infects bacteria of the Bacteroides group.

    • Bas E. Dutilh
    • Noriko Cassman
    • Robert A. Edwards
  • Article
    | Open Access

    Intestinal microbes can have important effects on our health. Here, the authors analyse the gut microbiota composition in 1,000 western adults and find that certain bacteria are either abundant or nearly absent, and that these alternative states are associated with ageing and overweight.

    • Leo Lahti
    • Jarkko Salojärvi
    • Willem M. de Vos
  • Article
    | Open Access

    Some viruses are spherical particles in which protein components are organized with well-defined icosahedral and local symmetries. Here, Gipson et al. describe a unique arrangement of proteins, breaking all expected local symmetries, in particles of a marine bacterial virus.

    • Preeti Gipson
    • Matthew L. Baker
    • Wah Chiu
  • Article
    | Open Access

    Dyslipidemia and obesity have a high prevalence in populations with Amerindian backgrounds, such as Mexican–Americans. Here, the authors design an approach to identify Amerindian risk genes in Mexicans and identify five genomic loci, including RORA and SIK3, that may contribute to the risk of dyslipidemia and obesity in Amerindian populations.

    • Arthur Ko
    • Rita M. Cantor
    • Päivi Pajukanta
  • Article

    Analyses of genome and transcriptome data are unable to accurately predict protein levels and function in tumour samples. Here, the authors carry out a comprehensive protein analysis in 3,467 samples from The Cancer Genome Atlas, providing a resource to study the prognostic and therapeutic potential of tumour proteins.

    • Rehan Akbani
    • Patrick Kwok Shing Ng
    • Gordon B. Mills
  • Article

    Model-based part design is a key step in synthetic biology. Here, the authors report a method for tuning nucleosome architecture in order to strengthen native promoters and facilitate synthetic promoter design in yeast.

    • Kathleen A. Curran
    • Nathan C. Crook
    • Hal S. Alper
  • Article
    | Open Access

    Misfolded protein accumulation is a hallmark of many neurodegenerative diseases. Here Budrikis et al. model protein aggregation in the endoplasmic reticulum and show that it is the result of a non-equilibrium phase transition caused by tipping the balance between the rates of protein production and degradation.

    • Zoe Budrikis
    • Giulio Costantini
    • Stefano Zapperi
  • Article

    The enzyme butyrylcholinesterase (BChE) can metabolize cocaine, albeit at relatively low speeds. Here the authors use computational methods to define mutations that increase BChE-mediated cocaine hydrolysis, achieving a catalytic activity comparable to that of one of the fastest naturally occurring enzymes.

    • Fang Zheng
    • Liu Xue
    • Chang-Guo Zhan
  • Article
    | Open Access

    Gene expression is highly variable between tissues, and changes during development and with age. Here, the authors provide a comprehensive RNA-Seq analysis of the rat transcriptome, spanning eleven organs, four developmental stages and both sexes.

    • Ying Yu
    • James C. Fuscoe
    • Charles Wang
  • Article
    | Open Access

    mRNA transport contributes to the proper localization of its cognate proteins. Here the authors report a correlation between the physicochemical properties of mRNAs and their cognate proteins, suggesting that these properties of mRNAs can predict the subcellular localization of their cognate proteins.

    • Anton A. Polyansky
    • Mario Hlevnjak
    • Bojan Zagrovic
  • Article

    Predicting the dynamics and disorder of a protein is a computationally complex task that, until now, has depended on prior knowledge of protein structure. Cilia et al. develop a tool to rapidly predict protein backbone dynamics based on sequence alone.

    • Elisa Cilia
    • Rita Pancsa
    • Wim F. Vranken
  • Article

    Non-small cell lung cancers (NSCLC) that harbour mutations in KRas can be separated into KRas-dependent and -independent subsets. By analysing transcriptome, proteome and phosphoproteome data from NSCLC cell lines, Balbin et al. show that KRas-dependent cell lines activate the Lck pathway.

    • O. Alejandro Balbin
    • John R. Prensner
    • Arul M. Chinnaiyan
  • Article
    | Open Access

    FGFR2 gene variation is associated with breast cancer risk but the molecular mechanism is unknown. Fletcher et al. provide a link between FGFR2 signalling and breast cancer susceptibility by demonstrating that FGFR2 signalling activates the ERα transcriptional network, which drives transcription of risk genes.

    • Michael N. C. Fletcher
    • Mauro A. A. Castro
    • Kerstin B. Meyer
  • Article

    Mutually exclusive splicing of genes is a mechanism for generating proteome diversity. Here Kollmar et al. determine the mutually exclusive spliced exome of Drosophila melanogaster and reveal insights into its evolutionary history within the Drosophila group.

    • Klas Hatje
    • Martin Kollmar
  • Article
    | Open Access

    Dynamic changes in T cell repertoire underlie immune responses during infection, allergy, autoimmunity and cancer. Here, Li et al. present a workflow for high throughput sequencing and analysis of T cell receptor sequences, and use it to monitor the T cell response to influenza vaccination in a human patient.

    • Shuo Li
    • Marie-Paule Lefranc
    • Eric J. Gowans
  • Article

    Sequencing whole microbial genomes has become standard practice and methods to examine their phylogenetic relationships need to match the increasing demand. Segata et al. present a new computational pipeline that allows fast and accurate taxonomic assignment of microbial species.

    • Nicola Segata
    • Daniela Börnigen
    • Curtis Huttenhower
  • Article
    | Open Access

    Biological network data are often incomplete, which makes it difficult to determine interaction motifs within such data sets. Here Tran et al. present a new method to count motif numbers in large networks from noisy and incomplete biological data.

    • Ngoc Hieu Tran
    • Kwok Pui Choi
    • Louxin Zhang