TY - JOUR
T1 - msqrob2PTM
T2 - Differential Abundance and Differential Usage Analysis of MS-Based Proteomics Data at the Posttranslational Modification and Peptidoform Level
AU - Demeulemeester, Nina
AU - Gébelin, Marie
AU - Gomes, Lucas Caldi
AU - Lingor, Paul
AU - Carapito, Christine
AU - Martens, Lennart
AU - Clement, Lieven
N1 - Publisher Copyright:
© 2023 THE AUTHORS.
PY - 2024/2
Y1 - 2024/2
N2 - In the era of open-modification search engines, more posttranslational modifications than ever can be detected by LC-MS/MS-based proteomics. This development can switch proteomics research into a higher gear, as PTMs are key in many cellular pathways important in cell proliferation, migration, metastasis, and aging. However, despite these advances in modification identification, statistical methods for PTM-level quantification and differential analysis have yet to catch up. This absence can partly be explained by statistical challenges inherent to the data, such as the confounding of PTM intensities with its parent protein abundance. Therefore, we have developed msqrob2PTM, a new workflow in the msqrob2 universe capable of differential abundance analysis at the PTM and at the peptidoform level. The latter is important for validating PTMs found as significantly differential. Indeed, as our method can deal with multiple PTMs per peptidoform, there is a possibility that significant PTMs stem from one significant peptidoform carrying another PTM, hinting that it might be the other PTM driving the perceived differential abundance. Our workflows can flag both differential peptidoform abundance (DPA) and differential peptidoform usage (DPU). This enables a distinction between direct assessment of differential abundance of peptidoforms (DPA) and differences in the relative usage of peptidoforms corrected for corresponding protein abundances (DPU). For DPA, we directly model the log2- transformed peptidoform intensities, while for DPU, we correct for parent protein abundance by an intermediate normalization step which calculates the log2-ratio of the peptidoform intensities to their summarized parent protein intensities. We demonstrated the utility and performance of msqrob2PTM by applying it to datasets with known ground truth, as well as to biological PTM-rich datasets. Our results show that msqrob2PTM is on par with, or surpassing the performance of, the current state-of-theart methods. Moreover, msqrob2PTM is currently unique in providing output at the peptidoform level.
AB - In the era of open-modification search engines, more posttranslational modifications than ever can be detected by LC-MS/MS-based proteomics. This development can switch proteomics research into a higher gear, as PTMs are key in many cellular pathways important in cell proliferation, migration, metastasis, and aging. However, despite these advances in modification identification, statistical methods for PTM-level quantification and differential analysis have yet to catch up. This absence can partly be explained by statistical challenges inherent to the data, such as the confounding of PTM intensities with its parent protein abundance. Therefore, we have developed msqrob2PTM, a new workflow in the msqrob2 universe capable of differential abundance analysis at the PTM and at the peptidoform level. The latter is important for validating PTMs found as significantly differential. Indeed, as our method can deal with multiple PTMs per peptidoform, there is a possibility that significant PTMs stem from one significant peptidoform carrying another PTM, hinting that it might be the other PTM driving the perceived differential abundance. Our workflows can flag both differential peptidoform abundance (DPA) and differential peptidoform usage (DPU). This enables a distinction between direct assessment of differential abundance of peptidoforms (DPA) and differences in the relative usage of peptidoforms corrected for corresponding protein abundances (DPU). For DPA, we directly model the log2- transformed peptidoform intensities, while for DPU, we correct for parent protein abundance by an intermediate normalization step which calculates the log2-ratio of the peptidoform intensities to their summarized parent protein intensities. We demonstrated the utility and performance of msqrob2PTM by applying it to datasets with known ground truth, as well as to biological PTM-rich datasets. Our results show that msqrob2PTM is on par with, or surpassing the performance of, the current state-of-theart methods. Moreover, msqrob2PTM is currently unique in providing output at the peptidoform level.
UR - http://www.scopus.com/inward/record.url?scp=85185835770&partnerID=8YFLogxK
U2 - 10.1016/j.mcpro.2023.100708
DO - 10.1016/j.mcpro.2023.100708
M3 - Article
C2 - 38154689
AN - SCOPUS:85185835770
SN - 1535-9476
VL - 23
JO - Molecular and Cellular Proteomics
JF - Molecular and Cellular Proteomics
IS - 2
M1 - 100708
ER -