Multiple Sequence Alignments with r-gt!

Author

Victor Yuan

Published

October 3, 2025

Multiple sequence alignment (MSA) is a computational technique for comparing biological sequences and identifying regions of similarity and difference. It is used to identify conserved functional domains and to understand evolutionary relationships between related proteins.
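To make "regions of similarity" concrete, here is a minimal sketch in base R that scores each alignment column by how many sequences share its most common residue. The three sequences and their names (`seq_a`, `seq_b`, `seq_c`) are made-up toy data, not taken from any real alignment, and the scoring rule is one simple choice among many.

```r
# Toy aligned sequences (hypothetical; "-" marks a gap).
msa <- c(
  seq_a = "MKT-LLVA",
  seq_b = "MKTALLVA",
  seq_c = "MRT-LIVA"
)

# One row per sequence, one column per alignment position.
aln <- do.call(rbind, strsplit(msa, ""))

# Fraction of sequences that carry the most common residue at each position.
# Gaps are ignored when counting residues but still count in the denominator.
conservation <- apply(aln, 2, function(column) {
  residues <- column[column != "-"]
  if (length(residues) == 0) return(0)
  max(table(residues)) / length(column)
})

round(conservation, 2)
```

Positions scoring 1 are identical across all three toy sequences, and lower scores flag substitutions or gaps.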

For the 2025 Posit Table Contest, I wanted to explore how MSAs can be effectively visualized using the R package gt. This is something I have wanted to do for a long time, and I’m excited to share this exploration.

The data we are working with is an MSA from Wang et al. (2021), who identified a potential, universally conserved “weak spot” in coronavirus spike proteins that is vulnerable to specific cross-reactive monoclonal antibodies. The spike protein was the primary target for developing the COVID-19 vaccines, which saved millions of lives in the fight against the pandemic.

Figure 5 from their study presents an MSA comparing spike proteins from several coronavirus species: SARS-CoV, SARS-CoV-2, MERS-CoV, and HCoV-OC43 (which causes the common cold). This visualization left an immediate impression on me. Despite the large divergence between these species, the authors were able to identify a region with enough similarity to serve as a target for broadly reactive antibodies.

Let’s explore how we can leverage gt to better understand the evolutionary relationships between coronavirus protein sequences!
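As a rough illustration of the idea, the sketch below lays out a toy alignment as a gt table with one column per position and shades the fully conserved columns. It is a minimal example under stated assumptions, not necessarily how the tables in this post were built: the sequences, the object names (`aln_df`, `conserved`), and the fill color are all hypothetical.

```r
library(gt)

# Toy aligned sequences (hypothetical, not the Wang et al. data).
msa <- c(
  seq_a = "MKT-LLVA",
  seq_b = "MKTALLVA",
  seq_c = "MRT-LIVA"
)

# One row per sequence, one column per alignment position.
aln <- do.call(rbind, strsplit(msa, ""))
colnames(aln) <- as.character(seq_len(ncol(aln)))
aln_df <- data.frame(sequence = rownames(aln), aln, check.names = FALSE)

# Positions where every sequence carries the same character.
is_conserved <- apply(aln, 2, function(column) length(unique(column)) == 1)
conserved <- colnames(aln)[is_conserved]

# Build the table and shade the conserved positions.
tbl <- gt(aln_df, rowname_col = "sequence") |>
  tab_style(
    style = cell_fill(color = "lightblue"),
    locations = cells_body(columns = tidyselect::all_of(conserved))
  ) |>
  opt_table_font(font = "monospace")

tbl
```

A monospace font keeps residues visually aligned, and storing each position as its own column lets gt's usual column-targeting helpers style individual sites.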

Visualizing Conserved Regions Across Coronavirus Spike Proteins

Click through the tabs below to explore each region of the coronavirus spike protein alignment in detail.

If you enjoyed this article and want to use some of these functions yourself, check out the development of gtseq.

References

Wang, Chunyan, Rien van Haperen, Javier Gutiérrez-Álvarez, Wentao Li, Nisreen M. A. Okba, Irina Albulescu, Ivy Widjaja, et al. 2021. “A Conserved Immunogenic and Vulnerable Site on the Coronavirus Spike Protein Delineated by Cross-Reactive Monoclonal Antibodies.” Nature Communications 12 (1): 1715. https://doi.org/10.1038/s41467-021-21968-w.