posted on 2024-05-01, 07:43authored byJi-Eun Park, Markia A. Smith, Sarah C. Van Alsten, Andrea Walens, Di Wu, Katherine A. Hoadley, Melissa A. Troester, Michael I. Love
This figure shows TCGA results with a different set of breast cancer signatures
Funding
National Institute of Environmental Health Sciences (NIEHS)
United States Department of Health and Human Services
Somatic mutational signatures elucidate molecular vulnerabilities to therapy, and therefore detecting signatures and classifying tumors with respect to signatures has clinical value. However, identifying the etiology of the mutational signatures remains a statistical challenge, with both small sample sizes and high variability in classification algorithms posing barriers. As a result, few signatures have been strongly linked to particular risk factors.
Here, we develop a statistical model, Diffsig, for estimating the association of one or more continuous or categorical risk factors with DNA mutational signatures. Diffsig takes into account the uncertainty associated with assigning signatures to samples as well as multiple risk factors' simultaneous effect on observed DNA mutations.
We applied Diffsig to breast cancer data to assess relationships between five established breast-relevant mutational signatures and etiologic variables, confirming known mechanisms of cancer development. In simulation, our model was capable of accurately estimating expected associations in a variety of contexts.
Diffsig allows researchers to quantify and perform inference on the associations of risk factors with mutational signatures.
We expect Diffsig to provide more robust associations of risk factors with signatures to lead to better understanding of the tumor development process and improved models of tumorigenesis.