The Phylogenetic Tree of the SARS-CoV-2 Virus

SARS-CoV-2 is a betacoronavirus thought to have originated in bats before jumping to humans and causing the COVID-19 pandemic.

The majority of mutations observed to date have had only mild effects on the virus, and none are thought to have produced a novel or more dangerous strain so far. This gives great hope for successful vaccines and treatments. It would not be unexpected, however, for the virus to evolve into different strains in the future, as influenza has.

What is a Phylogenetic Tree?

Phylogenetic trees are diagrammatic representations of the evolutionary relationships between related species, based on their genetic and physical similarities. At the broadest level, phylogenetic trees originate from a ‘common ancestor’ that gave rise to the multitude of life, including bacteria, archaea, and Eukaryota.
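The idea of lineages tracing back to a shared ancestor can be sketched in a few lines of Python. This is only a toy illustration: the node names and the branching order below are assumptions made for the example, not a statement about the real tree of life.

```python
# Toy tree: each node maps to its parent; the root has parent None.
# The topology and node names are illustrative assumptions only.
PARENT = {
    "common_ancestor": None,
    "bacteria": "common_ancestor",
    "archaea_eukaryota_ancestor": "common_ancestor",
    "archaea": "archaea_eukaryota_ancestor",
    "eukaryota": "archaea_eukaryota_ancestor",
}


def path_to_root(node, parent=PARENT):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def most_recent_common_ancestor(a, b, parent=PARENT):
    """Return the first node shared by the root-ward paths of `a` and `b`."""
    ancestors_of_a = set(path_to_root(a, parent))
    for node in path_to_root(b, parent):
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes do not share a root")


print(most_recent_common_ancestor("bacteria", "eukaryota"))
```

Walking from each tip towards the root and stopping at the first shared node is the simplest way to find where two lineages diverged; real phylogenetics software also stores branch lengths and infers the tree from sequence data.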

Viruses can be treated as biological species in the sense that they comprise nucleic acid sequences and are subject to a wide array of evolutionary changes, including mutations to those sequences. Whilst viruses are not living organisms, the ability of viral nucleic acid sequences to accumulate changes through mutation or recombination with other species gives rise to novel viral lineages.

SARS-CoV-2 Phylogenetic Tree

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of the ongoing global pandemic of coronavirus disease 2019 (COVID-19), which originated around mid-December 2019 in Wuhan, Hubei Province, China. On March 11, 2020, COVID-19 was declared a global pandemic by the World Health Organization, by which point the virus had spread to most nations.

SARS-CoV-2 is a lineage B betacoronavirus belonging to the Coronaviridae family. This family belongs to the order Nidovirales, of the class Pisoniviricetes, of the phylum Pisuviricota, of the kingdom Orthornavirae, of the realm Riboviria. As such, the virus has an RNA genome (positive-sense single-stranded RNA in a single linear segment) and an RNA-dependent RNA polymerase (RdRp), which synthesises RNA from an RNA template.

Lineage B betacoronaviruses also include SARS-CoV, the virus that causes SARS, and both viruses bind to the ACE2 receptor. However, unlike SARS-CoV, SARS-CoV-2 contains an evolutionarily distinct and proteolytically sensitive activation loop (a furin-like cleavage site) that is thought to be the reason behind its increased pathogenicity and transmissibility.

SARS-CoV-2 is considered to have originated in bats because of its close genetic similarity (96%) to bat coronaviruses. There is no concrete evidence that another host acted as a reservoir for the virus before transmission to humans, although the virus shares up to 92% similarity with pangolin coronaviruses.
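Similarity figures of this kind are typically percent-identity values taken from a sequence alignment. The article does not say how the 96% and 92% values were calculated, so the following is only a minimal sketch of one common definition (matching positions divided by compared positions, skipping gaps). The function name and the short sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of non-gap aligned positions at which two sequences match.

    Both inputs must already be aligned to the same length; '-' marks a gap.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue  # skip alignment gaps rather than count them as mismatches
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 10-base fragments differing at one position.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))  # 90.0
```

Whether gaps are skipped or counted as mismatches changes the result, which is one reason published similarity percentages for the same pair of genomes can differ slightly.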

Some evidence suggests that bat-borne SARS-CoV-2 may have jumped to pangolins, then back to bats (incorporating some pangolin homology), and then to humans.

Image Credit: Kateryna Kon

Recent Mutations to SARS-CoV-2

Analyses of SARS-CoV-2 from different nations at different times during the pandemic have revealed that the virus has undergone several mutations. Some do not seem to have any significant impact on the virus, whereas others are thought to be more significant and may point to strain divergence. Whether this means the virus has become more virulent is not immediately clear, and such changes may even make the virus weaker than its current form.

It is inevitable that, as time progresses, the virus will accumulate independent mutations in different locations. To date, the sequenced genomes are only moderately diverse, with an average pairwise difference of 9.6 SNPs between any two genomes (according to one study). This indicates that the common ancestor is recent, with an estimated mutation rate of around 6×10⁻⁴ nucleotides/genome/year.
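An average pairwise SNP difference is the mean number of differing positions across every possible pair of aligned genomes. The sketch below shows that calculation on three short, made-up sequences; the function names and data are hypothetical, and a real analysis would work on full-length alignments with ambiguous bases and gaps handled explicitly.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def mean_pairwise_snps(genomes):
    """Average SNP distance over every unordered pair of genomes."""
    pairs = list(combinations(genomes, 2))
    if not pairs:
        raise ValueError("need at least two genomes")
    return sum(snp_distance(a, b) for a, b in pairs) / len(pairs)


# Made-up aligned fragments, for illustration only.
toy_genomes = ["ACGTACGT", "ACGTACGA", "ACCTACGA"]
print(mean_pairwise_snps(toy_genomes))
```

Here the three pairs differ at 1, 2 and 1 positions, so the mean is 4/3. A small value relative to genome length is what points to a recent common ancestor.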

The specific mutations that have occurred in SARS-CoV-2 are largely neutral, although some may be allowing the virus to adapt further to the human host. One of the strongest divergences has occurred at site 11083 of Orf1a, which encodes Nsp6. This site is thought to fall within a region that elicits CD4+/CD8+ T-cell responses, so changes within this region may account for differences in immune responses to SARS-CoV-2.

One study has classified several mildly different variants of SARS-CoV-2. Type A refers to the original Chinese variant (with two sub-clusters distinguished by the mutation T29095C). Type B is also present in Asian countries, the US, and Europe. Type B diverges from Type A by two mutations, T8782C and C28144T, the latter changing a leucine to a serine. Type C differs from Type B at G26144T (glycine to valine); it is the major European variant and is largely absent in China.
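Labels such as T8782C follow a common convention: reference base, 1-based genome position, then the new base. The sketch below parses that notation and applies a deliberately simplified reading of the Type A/B/C description above. It is an assumption-laden illustration, not the network analysis the study itself used; the function names are hypothetical, and the Type A sub-clusters are ignored.

```python
import re

MUTATION_RE = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_mutation(label):
    """Split a label like 'T8782C' into (reference base, position, new base)."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"not a nucleotide substitution label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


# Defining substitutions as described in the text above.
A_TO_B = [parse_mutation("T8782C"), parse_mutation("C28144T")]
B_TO_C = [parse_mutation("G26144T")]


def classify_type(bases):
    """Assign a simplified type from a dict of {position: observed base}."""

    def ancestral(mutation):
        return bases.get(mutation[1]) == mutation[0]

    def derived(mutation):
        return bases.get(mutation[1]) == mutation[2]

    if all(ancestral(m) for m in A_TO_B):
        return "A"
    if all(derived(m) for m in A_TO_B):
        return "C" if all(derived(m) for m in B_TO_C) else "B"
    return "unclassified"


print(classify_type({8782: "C", 28144: "T", 26144: "T"}))  # C
```

Genomes carrying only one of the two A-to-B substitutions fall through to "unclassified", which reflects the limits of a rule-based shortcut compared with a full phylogenetic network.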

Despite the presence of mildly divergent, geographically distinct variants defined by specific mutations at specific locations (Types A-C), it is important to stress that at present there are no distinctly divergent strains of SARS-CoV-2, and any vaccine targeting the current strain should work effectively.

Due to the zoonotic nature of SARS-CoV-2, it is next to impossible to predict the trajectory of the future phylogenetic diversity of the virus, and how it may adapt and evolve to infect humans in different ways.

As with influenza viruses, which have several divergent strains, SARS-CoV-2 may also diverge into multiple strains with differing levels of virulence and transmissibility. This would be the biggest concern for vaccine development, and only time will tell whether such strongly divergent strains develop.

In summary, SARS-CoV-2 shares high homology with bat coronaviruses and, as such, is thought to be bat-borne. The role of an intermediate host reservoir, thought to be pangolins, is still to be confirmed. Whilst numerous mutations have occurred within SARS-CoV-2, giving rise to distinct geographical variants, none at present are thought to have diverged strongly enough to constitute a novel strain distinct from the SARS-CoV-2 virus currently in circulation.


  • Jaimes et al, 2020. Phylogenetic Analysis and Structural Modeling of SARS-CoV-2 Spike Protein Reveals an Evolutionary Distinct and Proteolytically Sensitive Activation Loop. J Mol Biol. 432(10): 3309-3325.
  • Stefanelli et al, 2020. Whole-genome and phylogenetic analysis of two SARS-CoV-2 strains isolated in Italy in January and February 2020: additional clues on multiple introductions and further circulation in Europe. Euro Surveill. 25(13): 2000305.
  • Forster et al, 2020. Phylogenetic network analysis of SARS-CoV-2 genomes. Proc Natl Acad Sci USA. 117(17): 9241-9243.
  • Van Dorp et al, 2020. Emergence of genomic diversity and recurrent mutations in SARS-CoV-2. Infect Genet Evol. 83: 104351.

Last Updated: Jul 20, 2020

Written by

Osman Shabir

Osman is a Neuroscience PhD Research Student at the University of Sheffield studying the impact of cardiovascular disease and Alzheimer's disease on neurovascular coupling using pre-clinical models and neuroimaging techniques.
