
Study Reveals Genome-Wide Host–Virus Genetic Interactions in Cancer Risk
New research from Columbia University Mailman School of Public Health reports a major advance in understanding how interactions between human and viral genomes shape disease risk. The study found that variations in the Epstein–Barr virus (EBV), together with a specific immune-related gene variant (HLA-A*11:01), strongly influence the risk of nasopharyngeal cancer, which arises in the upper part of the pharynx where it connects with the nasal cavity. The findings are published in Nature.
This study provides a large-scale genome-to-genome analysis of host–virus interactions in cancer risk. The study was co-led by Zhonghua Liu, ScD, assistant professor of Biostatistics at Columbia Mailman School and co-senior author, who led the statistical design and analysis.
Although EBV infects more than 95 percent of adults worldwide, only a small fraction develops EBV-associated cancers, making it critical to understand how host and viral factors jointly influence disease risk.
Rather than analyzing human or viral genetic variation separately, the researchers jointly examined human genome-wide association data alongside EBV whole-genome sequencing, enabling a systematic search for genetic interactions across both genomes.
Using this approach, the team identified a key interaction between the human HLA allele HLA-A*11:01 and a viral variant (SNP 85841G) in the EBV gene EBNA3B. The results show that cancer risk is shaped not by host genetics or viral strain alone, but by their combined effects. Individuals with susceptible HLA backgrounds infected with high-risk EBV strains carrying the 85841G variant face substantially elevated risk.
“Most genetic studies examine the host genome or the pathogen genome separately,” said Dr. Liu, who also leads the Causal Genomics Lab at Columbia. “By analyzing both genomes together using advanced statistical genetics methods and causal inference-inspired interaction frameworks, we can identify and quantify genetic interactions that would otherwise remain hidden.”
The study employed a stepwise analytical framework integrating statistical genetics with causal inference methods to detect and quantify host–virus interactions across the genome. The approach controlled for population structure in both human and viral genomes, as well as sample relatedness and multiple testing. Beyond statistical associations, functional experiments demonstrated that the viral mutation produces a peptide that can be presented by HLA-A*11:01, eliciting HLA-A*11:01-restricted CD8+ T-cell responses that can kill EBV-transformed B cells carrying 85841G strains.
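As a rough illustration of what a host–virus genetic interaction means statistically, the sketch below compares the odds ratio for a viral variant between people who carry a host allele and people who do not. It is not the study's pipeline, which this release describes only at a high level and which also adjusts for population structure, relatedness, and multiple testing. The function names, the two strata, and all counts are hypothetical, and the direction and size of the effect shown are arbitrary.

```python
from math import exp, log, sqrt


def odds_ratio(cases_with, controls_with, cases_without, controls_without):
    """Odds ratio for a viral variant from one 2x2 case-control table."""
    return (cases_with * controls_without) / (controls_with * cases_without)


def ratio_of_odds_ratios(allele_present, allele_absent, z=1.96):
    """Compare the viral-variant odds ratio between two host-genotype strata.

    Each argument is a tuple of counts:
    (cases_with, controls_with, cases_without, controls_without).
    Returns both odds ratios, their ratio, and a Wald interval for the ratio.
    A ratio far from 1 suggests interaction on the multiplicative scale.
    """
    or_present = odds_ratio(*allele_present)
    or_absent = odds_ratio(*allele_absent)
    ror = or_present / or_absent
    # Standard error of log(ratio): square root of the summed
    # reciprocals of all eight cell counts.
    se = sqrt(sum(1.0 / n for n in allele_present + allele_absent))
    ci = (exp(log(ror) - z * se), exp(log(ror) + z * se))
    return or_present, or_absent, ror, ci


# HYPOTHETICAL counts, invented for illustration only (not from the study).
host_allele_present = (20, 40, 40, 40)
host_allele_absent = (80, 40, 40, 40)

or_present, or_absent, ror, ci = ratio_of_odds_ratios(
    host_allele_present, host_allele_absent
)
print(f"Viral-variant OR, host allele present: {or_present:.2f}")
print(f"Viral-variant OR, host allele absent:  {or_absent:.2f}")
print(f"Ratio of odds ratios: {ror:.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f})")
```

With these made-up counts the viral variant has an odds ratio of 0.5 in one stratum and 2.0 in the other, so neither genome alone summarizes the risk. A genome-wide analysis would repeat a test of this kind across many host and viral variant pairs, typically with regression models rather than raw tables.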
Overall, the study underscores the growing role of statistical genomics and causal inference in modern biomedical discovery, particularly as researchers integrate multiple genomic systems to better understand complex diseases.
A complete list of authors and their institutions appears in the paper.
The authors declare no competing financial interests.
Media Contact
Stephanie Berger, sb2247@cumc.columbia.edu
