How disease sleuths are using genomics to track the coronavirus

Rapid sequencing of viral genomes can help public health officials figure out the origins, spread and nature of quickly moving epidemics
By Bob Holmes
May 3, 2020

In the early stages of a pandemic like Covid-19, public health officials need a lot of answers fast. How quickly is the virus spreading, and through which routes? How can we contain it? And when can we safely relax the most stringent control measures such as shelter-in-place?

Answering those questions is never easy, but in the face of the new coronavirus, epidemiologists have a powerful tool that wasn’t available for the earlier SARS and MERS epidemics (also caused by coronaviruses): rapid, large-scale sequencing of viral genomes. These genetic sequences from viruses that have infected patients, together with old-fashioned tracing of personal contacts, allow health officials to track the spread of a virus from person to person and place to place faster and more accurately than ever before. That speed, they hope, will translate into earlier control of the virus, and more precise management of the pandemic’s end stages.

Geneticists have been able to sequence viral genomes for decades, of course — but the latest advances in the technology mean they can now do so in a matter of hours or days. Just as quickly, scientists around the world can share what they learn via a global open-source network known as Nextstrain. That speed and cooperation have been a game-changer, enabling this “genomic epidemiology” to be used in real time as the Covid-19 pandemic unfolds.

“We have used genomic epidemiology in other contexts where we were getting sequence in a month or a few weeks, but we’ve never had anything where we’ve had such fast turnaround or the number of sequences being shared from so many places so quickly,” says Emma Hodcroft, a genetic epidemiologist at the University of Basel in Switzerland and member of the Nextstrain network.

S. WOHL ET AL / AR VIROLOGY 2016 / KNOWABLE MAGAZINE
Using genome sequences, researchers can deduce evolutionary relationships between different versions of the virus, helping to track the origin of a pandemic. From this and other information, they can reconstruct how and where the virus may have spread from person to person.

Sloppy copies

Much of the power of genomic epidemiology stems from the fact that most viruses make lots of mistakes when they copy their genomes, so changes in the sequence — that is, new mutations — turn up relatively often. That’s especially true of viruses that use RNA as their genetic material, as coronaviruses do. Very few of these mutations affect how the virus behaves — most have no apparent consequence at all — but researchers can use them as markers to build a family tree of the virus and to see how the virus has changed over time and how it has spread from locale to locale.

Early in the Covid-19 outbreak, researchers all over the world began sequencing viruses sampled from patients and building a family tree of the virus on Nextstrain. Almost immediately, they could see that the tree was short — the virus sequences had not yet accumulated many distinct mutations, meaning that the new coronavirus, SARS-CoV-2, hadn’t been infecting humans for long. Moreover, the tree had a single trunk, indicating that every virus infecting humans likely descended from a single case in early December 2019.

In contrast, periodic outbreaks of MERS in humans in the 2010s look more like a shrubland: multiple small clusters of virus genotypes that are more closely related to camel viruses than to one another, indicating that MERS must have jumped repeatedly from camels to humans and then fizzled out.

The SARS-CoV-2 virus’s genetic mutability also means that epidemiologists can use changes in its genome to trace the spread of the virus during an epidemic. That’s because most mutations are essentially random, so each branch of the virus tree is likely to bear its own unique set of mutations. If one person’s virus contains mutations A, B and C, for example, that person could have caught it from someone whose virus carries A and B or A and C, but not from someone whose virus has A, B, C and D.
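That elimination logic can be sketched in a few lines of Python. This is only an illustration of the simplified rule described above, not a tool that epidemiologists actually use: it assumes mutations are never lost once they appear, and the function name `could_be_source` and the candidate labels are hypothetical.

```python
def could_be_source(candidate: set, patient: set) -> bool:
    """Return True if the candidate's virus could be an ancestor of the patient's.

    Simplifying assumption (ours): a virus passes on every mutation it
    carries, so a possible source may not carry any mutation that the
    patient's virus lacks.
    """
    return candidate <= patient  # subset test on the two mutation sets


patient = {"A", "B", "C"}

# Hypothetical candidates, mirroring the example in the text.
candidates = {
    "carries A and B": {"A", "B"},
    "carries A and C": {"A", "C"},
    "carries A, B, C and D": {"A", "B", "C", "D"},
}

for label, mutations in candidates.items():
    verdict = "possible source" if could_be_source(mutations, patient) else "ruled out"
    print(f"{label}: {verdict}")
```

Note that the check only narrows the field. Two candidates survive here, and deciding between them still takes contact tracing or other evidence.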

J.L. GARDY & N.J. LOMAN / NATURE REVIEWS GENETICS 2018 / KNOWABLE MAGAZINE
Mutations in a viral genome can serve as genetic breadcrumbs, giving scientists insight into viral origins and spread.

Early in the current pandemic, researchers using Nextstrain noted the appearance of identical or near-identical coronavirus genomes from people in countries as far apart as Canada, Australia and the UK. The genomes were so similar that scientists inferred they must have shared a common source. That red flag prompted further questioning, which revealed that all of the patients had recently traveled to Iran.

“We could confirm that these patients must have been infected in Iran, because that’s the only thing they had in common,” says Hodcroft. Without the genomes, nothing would have linked those patients, and the Iranian connection would not have been noticed as quickly. Similarly, most viral genomes in the New York City region closely match those seen earlier in Europe, suggesting that infections came from there, not directly from China.

Of course, epidemiologists also track transmission routes the traditional way, by interviewing people and tracing their contacts. However, this method can’t keep up in the face of a pandemic, where thousands of new cases are added every day.

“There’s an advantage to old-fashioned shoe-leather contact tracing, because you can actually talk to people and find out who they spoke to,” says Hodcroft. “But as the number of cases rises, you cannot contact-trace everyone. You just don’t have enough people. That’s where using genetics can be a big help.”

Viral family tree

Genomes can be especially good at answering a key public health question early in an epidemic: Are new infections in a given locality imported by travelers, or are they homegrown? The latter — the result of the virus circulating within the community — would create a need for the social-distancing measures now familiar to so many of us.

“If you’re seeing strains that are really, really similar, that suggests that they’re transmitting locally,” says Shirlee Wohl, a genomic epidemiologist at Johns Hopkins Bloomberg School of Public Health and coauthor of a review of the field in the 2016 Annual Review of Virology. “That’s information you really can’t get from any other method.”

This portion of the evolutionary tree of SARS-CoV-2 virus shows three separate clusters of virus from Covid-19 patients in Ontario, Canada (red dots). Within each cluster, viruses are closely related, indicating local transmission, but the three clusters are more distantly related, indicating that each cluster was introduced separately from elsewhere. The most likely source is the US, based on the similarities in the viral sequences.

For example, the first Covid-19 infection in the state of Washington was in a traveler returning from Wuhan, China, where the outbreak began. When a later infection in Washington turned out to have a nearly identical sequence, this was strong evidence of community transmission — especially because the two individuals, though unacquainted, lived in the same county.

Unfortunately for genetic detectives, the Covid-19 virus changes a little too slowly for optimal tracking of transmission chains, Wohl notes. HIV, in contrast, mutates so quickly that each person usually carries a unique genotype, allowing epidemiologists to pinpoint the exact source of each new infection. For the Covid-19 virus, each viral lineage accumulates about 30 new mutations per year, which works out to about one new mutation per two links in the transmission chain. As a result, exactly the same viral genome sequence can be found in several people, so genome-trackers can narrow transmission down only to a handful of suspects.
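The arithmetic behind those figures is easy to check. The sketch below uses only the two numbers quoted above, plus the assumption of a 365-day year, to work out the time between transmissions that they imply. The variable names are ours, and the result is a back-of-the-envelope figure rather than a measured one.

```python
MUTATIONS_PER_YEAR = 30   # rate quoted in the article
LINKS_PER_MUTATION = 2    # "one new mutation per two links"
DAYS_PER_YEAR = 365       # assumption: ordinary calendar year

# Average wait between new mutations along a single lineage.
days_per_mutation = DAYS_PER_YEAR / MUTATIONS_PER_YEAR

# Implied average time for the virus to pass from one person to the next.
days_per_link = days_per_mutation / LINKS_PER_MUTATION

print(f"About {days_per_mutation:.1f} days per mutation")
print(f"About {days_per_link:.1f} days per transmission link")
```

In other words, a lineage picks up a new mutation roughly every 12 days, while the virus moves on to a new host about every six days, so many consecutive patients carry genomes that are indistinguishable.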

Additional uncertainty comes from the fact that researchers can’t possibly sequence viruses from every infected individual in a widespread pandemic. As of April 20, nearly 2.5 million people worldwide had been infected with SARS-CoV-2, but Nextstrain listed just 4,558 sequences. That can lead to false conclusions. “The beautiful danger is it looks like it can tell you a lot of enticing stories,” says Hodcroft. “But we don’t know that the scenario is exactly what happened.”
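To put that gap in perspective, here is a quick calculation using the two figures above. The case count is the article's rounded "nearly 2.5 million," so the result is approximate, and the variable names are ours.

```python
SEQUENCES_ON_NEXTSTRAIN = 4_558   # as of April 20, per the article
REPORTED_INFECTIONS = 2_500_000   # "nearly 2.5 million," rounded up

# Share of infections with a sequenced viral genome.
fraction_sequenced = SEQUENCES_ON_NEXTSTRAIN / REPORTED_INFECTIONS

# The same share expressed as "one genome per N infections".
infections_per_sequence = REPORTED_INFECTIONS / SEQUENCES_ON_NEXTSTRAIN

print(f"Share sequenced: {fraction_sequenced:.2%}")
print(f"Roughly one genome per {infections_per_sequence:.0f} infections")
```

With fewer than two genomes for every thousand infections, most links in any transmission chain are invisible to the tree.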

In late February, for example, sequencers found patients in Germany and Italy who shared the same unusual viral mutation. Since the German patient had gotten sick sooner, this led some researchers to suggest that the virus had spread from Germany to Italy. In reality, though, both German and Italian patients could have caught the virus from some third person, yet unidentified, whose virus was not sequenced.

Still, these limitations have not kept genomic epidemiology from playing a key role in the Covid-19 pandemic. The approach has helped public health officials identify the pathogen, trace its travels and recognize community spread promptly. And in the months ahead, the method may have more to contribute.

Using virus sequence data, researchers can track the spread of Covid-19 around the world. The animation starts in late 2019 and shows the first virus genome sequences found in January 2020 from Wuhan, China, with disease spreading rapidly in the weeks after.

One contribution is likely to come from longer-term studies of where mutations fall in the genome. Most of the genetic changes, remember, make little or no difference to the virus: They are “neutral,” in evolutionary biologists’ parlance. But mutations that change the shape of key proteins, such as the spike protein on the surface of the virus that binds to receptors in our cells, are more likely to matter.

Looking to see how these regions have changed since the virus infected humans may eventually help virologists understand why this particular virus has been able to adapt to us so well, says Hodcroft. However, this will require painstaking experiments over many months to reveal the functional effect of each mutation. “It’s not something that’s done in an afternoon,” she says.

Before that happens, genomic epidemiology promises to help public health officials find the smartest way to relax the burdensome social-distancing measures that are so important in controlling the pandemic right now. By using genomic breadcrumbs to track the transmission of the virus, epidemiologists hope to identify which activities are most likely to spread the virus. If schools, for example, turn out to pose a relatively low risk, authorities may be able to re-open those sooner.

“That hopefully means we can start relaxing those lockdowns faster than we might have 10 years ago, when we didn’t have this technology,” says Hodcroft. But that depends on a key factor that was not much in evidence at the start of the epidemic: the willingness of politicians to heed scientists’ warnings and advice.

This article originally appeared in Knowable Magazine, an independent journalistic endeavor from Annual Reviews.

Bob Holmes

Bob Holmes is a science writer in Edmonton, Canada.
