In March 2020, thousands of scientists around the world united to answer a pressing and complex question: which genetic factors help explain why some COVID-19 patients develop severe, life-threatening disease requiring hospitalization, while others escape with mild symptoms or none at all?
A comprehensive summary of their findings to date, published in Nature, reveals 13 loci, or locations in the human genome, that are strongly associated with infection or severe COVID-19. The researchers also identified causal factors such as smoking and high body mass index. These results come from one of the largest genome-wide association studies ever performed, which includes nearly 50,000 COVID-19 patients and two million uninfected controls.
The findings could help provide targets for future therapies and illustrate the power of genetic studies in learning more about infectious disease.
This global effort, called the COVID-19 Host Genomics Initiative, was founded in March 2020 by Andrea Ganna, group leader at the Institute for Molecular Medicine Finland (FIMM), University of Helsinki, and Mark Daly, director of FIMM and institute member at the Broad Institute of MIT and Harvard. The initiative has grown to be one of the most extensive collaborations in human genetics and currently includes more than 3,300 authors and 61 studies from 25 countries.
Ben Neale, co-director of the Program in Medical and Population Genetics at the Broad and co-senior author of the study, said that while vaccines confer protection against infection, there is still substantial room for improvement in COVID-19 treatment, which can be informed by genetic analysis. He added that improving treatment approaches could help shift the pandemic, which has necessitated large shutdowns in much of the world, to an endemic disease that is more localized and present at low but consistent levels in the population, much like the flu.
"The better we get at treating COVID-19, the better equipped the medical community could be to manage the disease," he said. "If we had a mechanism of treating infection and getting someone out of the hospital, that would radically alter our public health response."
To conduct their analysis, the consortium pooled clinical and genetic data from the nearly 50,000 patients in their study who tested positive for the virus and from two million controls, drawn from numerous biobanks, clinical studies, and direct-to-consumer genetics companies such as 23andMe. Because of the large amount of data pouring in from around the world, the scientists were able to produce statistically robust analyses far more quickly, and from a greater diversity of populations, than any one group could have on its own.
Of the 13 loci identified so far by the team, two had higher frequencies among patients of East Asian or South Asian ancestry than in those of European ancestry, underscoring the importance of diversity in genetic datasets.
"We've been much more successful than past efforts in sampling genetic diversity because we've made a concerted effort to reach out to populations around the world," said Daly. "I think we still have a long way to go, but we're making very good progress."
The team highlighted one of these two loci in particular, near the FOXP4 gene, which is linked to lung cancer. The FOXP4 variant associated with severe COVID-19 increases the gene's expression, suggesting that inhibiting the gene could be a potential therapeutic strategy. Other loci associated with severe COVID-19 included DPP9, a gene also involved in lung cancer and pulmonary fibrosis, and TYK2, which is implicated in some autoimmune diseases.
Mari Niemi, also at FIMM and lead analyst for the study, says the consortium prioritized communication as the scientists analyzed data, immediately releasing results on their website after they had been checked for accuracy. The team hopes their results might point the way to useful targets for repurposed drugs.
The researchers will continue to study more data as they come in and update their results through the "Matters Arising" format at Nature. They will begin to study what differentiates "long-haulers," or patients whose COVID-19 symptoms persist for months, from other patients, and will continue to identify additional loci associated with infection and severe disease.
"We'd like to aim to get a good handful of very concrete therapeutic hypotheses in the next year," Daly said. "Realistically, we will most likely be addressing COVID-19 as a serious health concern for a long time. Any therapeutic that emerges this year, for example from repurposing an existing drug based on clear genetic insights, would have a great impact."
Ganna emphasized that the scientists were able to find robust genetic signals because of their collaborative efforts, a cohesive spirit of data-sharing and transparency, and the urgency that comes with knowing the entire world faces the same threat at the same time. He added that geneticists, who regularly work with large datasets, have known the benefits of open collaboration for a long time. "This only illustrates just how much better science is -- how much faster it goes and how much more we discover -- when we work together," Ganna said.
Daly, for his part, is excited by how clear and interpretable their results have been for geneticists. He says the insights from this work have been unique and potentially paradigm-shifting for the field of human genetics, which has been dominated by studies of common chronic diseases, rare genetic diseases, and cancer.
"These discoveries have been really informative and that has made us realize that there's a lot of untapped potential in using genetics to understand and potentially develop therapeutics for infectious disease," Daly said. "I hope this sets an example for how we might bring population genetics approaches to a new set of problems that are especially important in developing parts of the world."
Source: Broad Institute of MIT and Harvard