TY - JOUR
T1 - HAMdetector
T2 - A Bayesian regression model that integrates information to detect HLA-associated mutations
AU - Habermann, Daniel
AU - Kharimzadeh, Hadi
AU - Walker, Andreas
AU - Li, Yang
AU - Yang, Rongge
AU - Kaiser, Rolf
AU - Brumme, Zabrina L.
AU - Timm, Jörg
AU - Roggendorf, Michael
AU - Hoffmann, Daniel
N1 - Publisher Copyright: © 2022 The Author(s). Published by Oxford University Press. All rights reserved.
PY - 2022/5/1
Y1 - 2022/5/1
N2 - Motivation: A key process in anti-viral adaptive immunity is that the human leukocyte antigen (HLA) system presents epitopes as major histocompatibility complex I (MHC I) protein-peptide complexes on cell surfaces and in this way alerts CD8+ cytotoxic T-lymphocytes (CTLs). This pathway exerts strong selection pressure on viruses, favoring viral mutants that escape recognition by the HLA/CTL system. Naturally, such immune escape mutations often emerge in highly variable viruses, e.g. HIV or HBV, as HLA-associated mutations (HAMs), specific to the host's MHC I proteins. The reliable identification of HAMs is not only important for understanding viral genomes and their evolution, but it also impacts the development of broadly effective anti-viral treatments and vaccines against variable viruses. By their very nature, HAMs are amenable to detection by statistical methods in paired sequence/HLA data. However, HLA alleles are very polymorphic in the human host population, which makes the available data relatively sparse and noisy. Under these circumstances, one way to optimize HAM detection is to integrate all relevant information in a coherent model. Bayesian inference offers a principled approach to achieve this. Results: We present a new Bayesian regression model for the detection of HAMs that integrates a sparsity-inducing prior, epitope predictions and phylogenetic bias assessment, and that yields easily interpretable quantitative information on HAM candidates. The model predicts experimentally confirmed HAMs as having high posterior probabilities, and it performs well in comparison to state-of-the-art models for several datasets from individuals infected with HBV, HDV and HIV.
AB - Motivation: A key process in anti-viral adaptive immunity is that the human leukocyte antigen (HLA) system presents epitopes as major histocompatibility complex I (MHC I) protein-peptide complexes on cell surfaces and in this way alerts CD8+ cytotoxic T-lymphocytes (CTLs). This pathway exerts strong selection pressure on viruses, favoring viral mutants that escape recognition by the HLA/CTL system. Naturally, such immune escape mutations often emerge in highly variable viruses, e.g. HIV or HBV, as HLA-associated mutations (HAMs), specific to the host's MHC I proteins. The reliable identification of HAMs is not only important for understanding viral genomes and their evolution, but it also impacts the development of broadly effective anti-viral treatments and vaccines against variable viruses. By their very nature, HAMs are amenable to detection by statistical methods in paired sequence/HLA data. However, HLA alleles are very polymorphic in the human host population, which makes the available data relatively sparse and noisy. Under these circumstances, one way to optimize HAM detection is to integrate all relevant information in a coherent model. Bayesian inference offers a principled approach to achieve this. Results: We present a new Bayesian regression model for the detection of HAMs that integrates a sparsity-inducing prior, epitope predictions and phylogenetic bias assessment, and that yields easily interpretable quantitative information on HAM candidates. The model predicts experimentally confirmed HAMs as having high posterior probabilities, and it performs well in comparison to state-of-the-art models for several datasets from individuals infected with HBV, HDV and HIV.
UR - http://www.scopus.com/inward/record.url?scp=85130036222&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btac134
DO - 10.1093/bioinformatics/btac134
M3 - Article
C2 - 35238330
AN - SCOPUS:85130036222
SN - 1367-4803
VL - 38
SP - 2428
EP - 2436
JO - Bioinformatics
JF - Bioinformatics
IS - 9
ER -