HIV-phyloTSI: subtype-independent estimation of time since HIV-1 infection for cross-sectional measures of population incidence using deep sequence data
Citation
Golubchik T, Abeler-Dörner L, Hall M, Wymant C, Bonsall D, Macintyre-Cockett G, Thomson L, Baeten JM, Celum CL, Galiwango RM, Kosloff B, Limbada M, Mujugira A, Mugo NR, Gall A, Blanquart F, Bakker M, Bezemer D, Ong SH, Albert J, Bannert N, Fellay J, Gunse. HIV-phyloTSI: subtype-independent estimation of time since HIV-1 infection for cross-sectional measures of population incidence using deep sequence data. BMC Bioinformatics. 2025, 26: 212. PMC12351810
Abstract
Background: Estimating the time since HIV infection (TSI) at the population level is essential for tracking changes in the global HIV epidemic. Most methods for determining TSI give only a binary classification of infections as recent or non-recent within a window of several months, and so cannot assess the cumulative impact of an intervention.

Results: We developed a Random Forest Regression model, HIV-phyloTSI, which combines measures of within-host diversity and divergence to generate continuous TSI estimates directly from viral deep-sequencing data, with no need for additional variables. HIV-phyloTSI provides a continuous measure of TSI up to 9 years, with a mean absolute error of less than 12 months overall and less than 5 months for infections with a TSI of up to a year. It performs equally well for all major HIV subtypes, based on data from African and European cohorts.

Conclusions: We demonstrate how HIV-phyloTSI can be used for incidence estimates at the population level.
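The accuracy figures quoted in the Results are mean absolute errors (MAE), reported both overall and for infections with a TSI of up to a year. As an illustration only, the following Python sketch shows how such a stratified MAE could be computed from paired true and estimated TSI values. The function names and the toy numbers are hypothetical and are not taken from the paper or from the HIV-phyloTSI software; they exist solely to make the arithmetic concrete.

```python
from statistics import fmean


def mean_absolute_error(true_years, predicted_years):
    """Mean absolute difference between true and estimated TSI (in years)."""
    if len(true_years) != len(predicted_years):
        raise ValueError("true and predicted lists must have the same length")
    if not true_years:
        raise ValueError("at least one sample is required")
    return fmean(abs(t - p) for t, p in zip(true_years, predicted_years))


def stratified_mae(true_years, predicted_years, max_true_tsi):
    """MAE restricted to samples whose true TSI is at most max_true_tsi years."""
    pairs = [(t, p) for t, p in zip(true_years, predicted_years)
             if t <= max_true_tsi]
    if not pairs:
        raise ValueError("no samples fall within the requested TSI range")
    return mean_absolute_error([t for t, _ in pairs], [p for _, p in pairs])


# Toy values in years (hypothetical, NOT data from the study).
true_tsi = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
pred_tsi = [0.5, 0.25, 1.5, 1.0, 5.0, 6.0]

# Convert from years to months to match the units used in the abstract.
overall_months = mean_absolute_error(true_tsi, pred_tsi) * 12
recent_months = stratified_mae(true_tsi, pred_tsi, max_true_tsi=1.0) * 12

print(f"MAE overall: {overall_months:.1f} months")
print(f"MAE for true TSI <= 1 year: {recent_months:.1f} months")
```

Stratifying by the true TSI rather than the estimated TSI is an assumption made here; it mirrors the phrasing "infections with a TSI of up to a year", but the paper's exact evaluation protocol may differ.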