Efficient Bayesian functional principal component analysis of irregularly-observed multivariate curves

This paper introduces a Bayesian hierarchical framework for multivariate functional principal component analysis (mFPCA) to handle irregularly and sparsely observed longitudinal data. The proposed model leverages shared functional principal component scores to pool information across correlated curves, providing parsimonious and interpretable summaries for follow-up analyses. Inference is based on a scalable and modular variational message passing algorithm which produces accurate uncertainty quantification. The method is applied to a COVID study conducted in Cambridge in 2020 and 2021; it is implemented in the R package bayesFPCA.

Scalable multiple network inference with the joint graphical horseshoe

This paper presents an expectation conditional maximisation algorithm for the graphical horseshoe estimator, which enhances scalability while maintaining accuracy. It also introduces a novel joint graphical horseshoe estimator that improves inference by sharing information across related networks while preserving their unique features. These methods, called fastGHS and jointGHS, enhance the modelling of complex network relationships, for instance when analysing large datasets in biological research.

A modeling framework for detecting and leveraging node-level information in Bayesian network inference

This paper presents a Gaussian graphical modeling framework that enhances network inference by integrating node-level information. The method is scalable, thanks to a variational ECM algorithm, and is validated through simulations and a gene network study identifying hub genes in immune-related pathways.

A patient-centric modeling framework captures recovery from SARS-CoV-2 infection

The biology underlying individual responses to SARS-CoV-2 infection is not well understood. In this paper, we developed a functional principal component analysis framework using longitudinal data from 215 individuals with varying disease severities over a year after onset. Our analysis revealed distinct "systemic recovery" profiles that showed progression and resolution of inflammatory, immune cell, metabolic, and clinical responses, highlighting potential pathways affecting the restoration of homeostasis, risk of death, and long COVID. Based on these data, we developed a composite signature predictive of systemic recovery and an online tool for prospective testing of our findings.

EPISPOT: An epigenome-driven approach for detecting and interpreting hotspots in molecular QTL studies

This paper introduces "epispot", a variational inference approach for Bayesian variable selection guided by predictor-level information. The method can be used to investigate the functional mechanisms underpinning QTL effects, leveraging large panels of epigenetic marks as variant-level information to enhance QTL mapping in regions of interest. It uses a top-level spike-and-slab regression submodel to couple QTL analysis with a hypothesis-free selection of biologically interpretable annotations which directly contribute to the QTL effects.
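As a rough illustration of what a spike-and-slab prior does, the sketch below computes the posterior inclusion probability of a single predictor in closed form. It is a generic textbook calculation, not the epispot model or its variational algorithm: the function name `inclusion_probability`, the known noise variance, and all default values are assumptions made for this example.

```python
import math


def inclusion_probability(x, y, prior_w=0.1, slab_var=1.0, noise_var=1.0):
    """Posterior probability that one predictor has a non-zero effect.

    Hypothetical single-predictor model (an assumption of this sketch):
        y = x * beta + noise,  noise ~ N(0, noise_var)
        beta = 0 with probability 1 - prior_w              (spike)
        beta ~ N(0, slab_var) with probability prior_w     (slab)
    """
    xtx = sum(v * v for v in x)
    xty = sum(a * b for a, b in zip(x, y))
    denom = noise_var + slab_var * xtx
    # Log Bayes factor of the slab model against the spike model.
    log_bf = 0.5 * math.log(noise_var / denom) \
        + slab_var * xty ** 2 / (2.0 * noise_var * denom)
    log_odds = math.log(prior_w / (1.0 - prior_w)) + log_bf
    # Numerically stable logistic transform of the posterior log-odds.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


x = [1.0, -1.0, 0.5, -0.5, 1.5, -1.5]
y_signal = [2.0 * v for v in x]             # response driven by x
y_null = [0.1, 0.1, -0.1, -0.1, 0.0, 0.0]   # response unrelated to x

pip_signal = inclusion_probability(x, y_signal)
pip_null = inclusion_probability(x, y_null)
print(pip_signal, pip_null)
```

With a strong signal the inclusion probability moves well above the prior weight, and with no signal it falls below it. In a method like the one described above, the same kind of selection would apply to annotations rather than to a single predictor.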

A fully joint Bayesian quantitative trait locus mapping of human protein abundance in plasma

This paper presents a fully multivariate proteomic quantitative trait locus (pQTL) analysis performed with our Bayesian method "locus" on data from two clinical cohorts, with plasma protein levels quantified by mass-spectrometry and aptamer-based assays.

A global-local approach for detecting hotspots in multiple-response regression

This paper introduces "atlasqtl", a method for Bayesian variable selection in regression problems with high-dimensional responses and candidate predictors, for use in genetic mapping with molecular traits. The approach implements a series of parallel yet hierarchically-related sparse regressions to (i) borrow information across up to tens of thousands of traits and (ii) model "hotspots", i.e., predictors associated with multiple traits, using a global-local prior formulation. Inference is carried out using an annealed variational procedure that allows efficient exploration of multimodal posterior spaces.
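The behaviour of a global-local prior can be sketched with the standard horseshoe, in which each effect has variance tau^2 * lambda_j^2 with a half-Cauchy local scale lambda_j. This is only a generic illustration under that assumption, not the exact prior used in atlasqtl; the helper names `half_cauchy` and `shrinkage_factors` are hypothetical.

```python
import math
import random


def half_cauchy(rng):
    """Draw from a standard half-Cauchy distribution via the inverse CDF."""
    return abs(math.tan(math.pi * (rng.random() - 0.5)))


def shrinkage_factors(n, tau=1.0, seed=0):
    """Simulate shrinkage factors kappa_j = 1 / (1 + tau^2 * lambda_j^2).

    kappa_j near 1 means the effect is shrunk to zero (noise);
    kappa_j near 0 means it is left almost unshrunk (signal).
    """
    rng = random.Random(seed)
    kappas = []
    for _ in range(n):
        lam = half_cauchy(rng)  # local scale, one per predictor
        kappas.append(1.0 / (1.0 + (tau * lam) ** 2))
    return kappas


kappas = shrinkage_factors(10000)
extreme = sum(k < 0.25 or k > 0.75 for k in kappas) / len(kappas)
print(f"share of shrinkage factors near 0 or 1: {extreme:.2f}")
```

Most of the prior mass sits near the two extremes, which is what lets a few predictors escape shrinkage while the rest are pulled towards zero. The global scale `tau` controls the overall sparsity level.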

Efficient inference for genetic association studies with multiple outcomes

This paper introduces "locus", a sparse multivariate regression method that allows simultaneous selection of predictors and their associated responses using variational inference.