Genome-wide association studies allow connecting genomic information with complex traits. Rodrigo Bonazzola et al. develop a framework consisting of several deep learning tools to improve the discoverability of genes that influence specific geometric features of the heart.
Visual representations are thought to develop from visual experience and inductive biases. Orhan and Lake show that modern machine learning algorithms can learn visual knowledge from a few hundred hours of longitudinal headcam recordings collected from young children during the course of early development, without strong inductive biases.
This Reusability Report examines a recently published deep learning method PENCIL by Ren et al. for identifying phenotype populations in single-cell data. Cao et al. reproduce here the main results, analyse the sensitivity of the method to model parameters and describe how the method can be used to create a signature for immunotherapy response markers.
Mutations can increase or decrease a protein’s ability to bind to other proteins, but modelling multiple mutations becomes computationally intractable. Lan and colleagues propose an adversarial deep learning architecture to guide the choice of mutations to optimize binding affinities.
Machine learning methods have made great advances in modelling protein sequences for a variety of downstream tasks. The representation used as input for these models has primarily been the sequence of amino acids. Outeiral and Deane show that using codon sequences instead can improve protein representations and lead to better model performance.
Algorithmic decisions have a history of harming already marginalized populations. To combat these discriminatory patterns, data-driven methods are used to understand them and, more recently, to identify disadvantaged communities for resource allocation. Huynh et al. analyse one of these tools and show a concerning sensitivity to input parameters that can lead to unintentional biases with substantial financial consequences.
A parameterized physical model that uses unpaired datasets for adaptive holographic imaging was published in Nature Machine Intelligence in 2023. Zhang and colleagues evaluate its performance and extend it to non-perfect optical systems by integrating specific optical response functions.
Deep learning language models have proved useful for both natural language and protein modelling. Similar to semantics in natural language, protein functions are complex and depend on the context of their environment, rather than on the similarity of sequences. Kulmanov and colleagues present an approach to frame function prediction as semantic entailment using a neuro-symbolic model to augment a large protein language model.
Denoising low-counting statistics data in the presence of multiple, unknown noise profiles is a challenging task in scientific applications where high accuracy is required. Oppliger and colleagues train a deep convolutional neural network on pairs of experimental low- and high-noise X-ray diffraction data and demonstrate better performance on experimental noise filtering compared with the case of training on artificial data pairs.
Realistic quantum mechanical simulations are computationally costly to perform but can be approximated using neural network models. Li and colleagues propose a forward propagation method in lieu of traditional backpropagation to speed up these neural network-based approaches.
Great advances in protein structure prediction have been made with recent deep learning-based methods, but proteins interact with their environment and can change shape drastically when binding to ligand molecules. To predict the 3D structure of these combined protein–ligand complexes, Qiao et al. developed a generative diffusion model with biophysical constraints and geometric deep learning.
AI-enabled diagnostic applications in healthcare can be powerful, but careful study design is essential to avoid subtle bias in the dataset and evaluation. Coppock et al. demonstrate how an AI-based classifier for diagnosing SARS-CoV-2 infection from audio recordings can appear to make predictions with high accuracy but shows much lower performance once confounders are taken into account, providing insights into study design and replicability in AI-based audio analysis.
Machine learning techniques are widely employed in chemical science, but are application specific and their development requires dedicated expertise. Jablonka and colleagues fine-tune the GPT-3 model and show that it can provide surprisingly accurate answers to a wide range of chemical questions.
Recent years have seen many advances in deep learning models for protein design, usually involving a large amount of training data. Focusing on potential clinical impact, Garton et al. develop a variational autoencoder approach trained on sparse data of natural sequences of adenoviruses to generate large proteins that can be used as viral vectors in gene therapy.
Designing antibodies and assessing their biophysical properties for potential therapeutic development is challenging with current computational methods. Ramon et al. have developed a deep learning approach called AbNatiV, based on a vector-quantized variational autoencoder, that accurately assesses the nativeness of antibodies and nanobodies, which are small single-domain antibodies that have recently attracted considerable interest.
Drug design has recently seen immense improvements in computational methods, but models can still struggle to generalize across binding pockets. Feng and colleagues combine a language model with geometric deep learning to provide efficient generation of potential new drugs.
Accurate real-time tracking of dexterous hand movements and interactions has applications in human–computer interaction, the metaverse, robotics and tele-health. Capturing realistic hand movements is challenging due to the large number of articulations and degrees of freedom. Tashakori and colleagues report accurate and dynamic tracking of articulated hand and finger movements using stretchable, washable smart gloves powered by machine learning.
Magnetic microrobots are of considerable interest for non-invasive biomedical applications, but it is challenging to develop a general strategy for controlling microrobot positions across varying configurations and environments. Choi et al. develop a reinforcement learning control method, training the model in a simulation environment for initial exploration, after which the learning process is transferred to a physical electromagnetic actuation system.
Multi-animal behaviour quantification is pivotal for deciphering animal social behaviours and has broad applications in neuroscience and ecology. Han and colleagues develop a few-shot learning framework for multi-animal 3D pose estimation, identity recognition and social behaviour classification.
Feed-forward neural networks have become powerful tools in machine learning, but their behaviour during optimization is still not well understood. Ciceri and colleagues find that during optimization, class representations first separate and then rejoin, prompted by specific elements of the training set.