The global outbreak of the novel coronavirus disease, commonly known as COVID-19, has reached pandemic status, with 2.5 million reported cases and 171 thousand reported deaths worldwide. With the infection spreading quickly to 210 countries, scientists around the world have taken up the mission of formulating a vaccine against SARS-CoV-2, the specific coronavirus that causes COVID-19.
Vaccines mimic an infection and stimulate the immune system to produce defensive antibodies against a pathogen's antigens. Vaccines can be classified into three main types:
- Whole pathogen vaccines
- Sub-unit vaccines
- Nucleic Acid vaccines
Whole pathogen vaccines are the traditional type. They contain inactivated or killed microbes and are used to induce an immune response against that specific pathogen, preventing diseases like hepatitis A, polio, and influenza.
Subunit vaccines contain only the parts of the microbe, usually proteins, that best stimulate the immune system. They are used against diseases like shingles and pertussis.
Nucleic acid vaccines, by contrast, contain the specific genetic material that encodes the antigen for the desired immune response. Once delivered, these small DNA or mRNA molecules instruct the host's cells to produce the antigen, which then triggers the immune response. Because of their long-term immune response, easy large-scale manufacture, and quick effectiveness, nucleic acid vaccines are expected to be used to tackle COVID-19.
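To make the idea of genetic material "encoding" an antigen concrete, here is a minimal Python sketch that translates an mRNA sequence into its amino acid sequence using the standard genetic code. The function name `translate` and the short input sequence are made up for illustration; this is not a real vaccine sequence, and real vaccine design involves far more than this lookup.

```python
# Build the standard genetic code table (DNA alphabet, bases ordered T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def translate(sequence):
    """Translate an mRNA or DNA coding sequence into one-letter amino acids.

    Reads the sequence three bases at a time from the start and stops at
    the first stop codon. Trailing bases that do not form a full codon
    are ignored.
    """
    dna = sequence.upper().replace("U", "T")  # treat mRNA and DNA alike
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy sequence: AUG (Met), GCU (Ala), UGG (Trp), UAA (stop).
print(translate("AUGGCUUGGUAA"))  # prints MAW
```

The host cell performs this same mapping, codon by codon, when it reads the vaccine's mRNA and assembles the antigen protein.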
Recently, the Harvard T.H. Chan School of Public Health and the Human Vaccines Project announced the Human Immunomics Initiative, a joint effort proposing to use artificial intelligence models to accelerate vaccine development.
Researchers from The University of Texas at Austin and the University of Washington’s Institute for Protein Design are using artificial intelligence and machine learning (AI/ML) modeling in vaccine development.
While a huge amount of structured and unstructured data is coming out of biotechnology labs and health care centers around the globe, tech giants like Microsoft and Alibaba have dedicated resources to developing AI models that gather and process this data to help detect COVID-19 and formulate vaccines against it.
So how is AI being used to develop vaccines against coronavirus?
Well, much of what makes a virus dangerous lies in its proteins, each made of a unique sequence of amino acids that defines its shape and its harmful behavior. To make a nucleic acid vaccine, the encoded protein must reproduce the unique 3D structure of the corresponding viral protein. So to analyze, study, and explore the possible shapes of the protein, we need the predictive modeling of artificial intelligence systems.
Google DeepMind introduced AlphaFold in January 2020. It is a system that predicts the 3D structure of a protein based on its genetic sequence. The system was put to the test on COVID-19 in March. DeepMind also released structure predictions for several under-studied proteins associated with the virus, to help the research community better understand the virus and its structure.
The University of Washington’s Institute for Protein Design used computer models to develop 3D atomic-scale models of the SARS-CoV-2 spike protein, and these matched the structure determined in the University of Texas at Austin lab.
Researchers from The University of Texas at Austin and the National Institutes of Health created the first 3D atomic-scale map of the spike protein, which the virus uses to attach to host cells during the attachment phase of its life cycle. AlphaFold provided an accurate prediction for this spike structure, which helped them with their breakthrough achievement.
With the urgency of finding a vaccine, a large volume of research is being published and added to the scientific literature. Finding the papers relevant to one’s specific research is therefore becoming harder with every passing day. AI offers a solution to this problem as well: natural language processing (NLP) can sort through unstructured text and convert it into structured data.
The Allen Institute for AI, in collaboration with several research organizations, has prepared a unique resource of 44 thousand scholarly articles about COVID-19, SARS-CoV-2, and related coronaviruses, named the COVID-19 Open Research Dataset (CORD-19). The resource is free and is updated daily as new research is published. The data is also machine-readable, so natural language processing algorithms can be applied to it.
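As a rough illustration of how NLP can help a researcher find relevant papers in such a collection, the following Python sketch ranks a few abstracts against a search query using TF-IDF weighting, one simple and widely used text-scoring technique. The source does not say which algorithms are applied to CORD-19, so this choice is an assumption, and the abstracts below are invented placeholders, not actual CORD-19 entries.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, documents):
    """Return indices of matching documents, most relevant first.

    Each document is scored by summing, over the query terms it contains,
    term frequency times a smoothed inverse document frequency.
    Documents that share no term with the query are left out.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(documents)

    # Count how many documents contain each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scored = []
    for index, tokens in enumerate(tokenized):
        term_counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term in term_counts:
                tf = term_counts[term] / len(tokens)
                idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1
                score += tf * idf
        if score > 0:
            scored.append((score, index))

    # Highest score first; ties broken by original position.
    return [index for score, index in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Invented placeholder abstracts, for illustration only.
abstracts = [
    "Structure of the spike protein of a coronavirus",
    "Hospital staffing during the pandemic",
    "Spike protein binding to the host cell receptor",
]
print(rank("spike protein receptor", abstracts))  # prints [2, 0]
```

The third abstract ranks first because it matches all three query terms, including the rarer term "receptor", while the second abstract matches none and is dropped. Production systems typically go well beyond this, but the ranking principle is similar.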