{"id":962,"date":"2021-02-01T09:12:31","date_gmt":"2021-02-01T15:12:31","guid":{"rendered":"https:\/\/sites.imsa.edu\/hadron\/?p=962"},"modified":"2021-02-01T09:13:27","modified_gmt":"2021-02-01T15:13:27","slug":"962","status":"publish","type":"post","link":"https:\/\/sites.imsa.edu\/hadron\/2021\/02\/01\/962\/","title":{"rendered":"The Language of a Mutated Virus"},"content":{"rendered":"<p><em><span style=\"font-weight: 400\">Written by Gloria Wang<\/span><\/em><\/p>\n<p><span style=\"font-weight: 400\">Natural Language Processing (NLP) is a branch of AI that deals specifically with the communication between computers and people using human language. But aside from being able to understand languages like English, Chinese, and German, NLP algorithms are now able to understand the language of genes. A team of researchers from MIT recently used a combination of NLP algorithms designed for modeling protein sequences and genetic codes to predict mutations that allow viruses to avoid detection by antibodies in the immune system, a process known as viral immune escape.<\/span><\/p>\n<p><span style=\"font-weight: 400\">As with all machine learning systems, NLP models must be trained. But instead of training this model on sentences and phrases, MIT researchers used tens of thousands of genetic sequences from three different viruses: influenza, HIV, and SARS-CoV-2, the coronavirus that causes COVID-19. 
Their goal is to identify mutations that allow viral immune escape, or, in terms of linguistics, \u201cmutations that change a virus\u2019s meaning without making it grammatically incorrect\u201d (Heaven 2021).\u00a0<\/span><\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: center\"><span style=\"font-weight: 400\">Figure 1<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-963\" src=\"http:\/\/sites.imsa.edu\/hadron\/files\/2021\/02\/hadron.png\" alt=\"\" width=\"646\" height=\"101\" srcset=\"https:\/\/sites.imsa.edu\/hadron\/files\/2021\/02\/hadron.png 646w, https:\/\/sites.imsa.edu\/hadron\/files\/2021\/02\/hadron-300x47.png 300w, https:\/\/sites.imsa.edu\/hadron\/files\/2021\/02\/hadron-352x55.png 352w, https:\/\/sites.imsa.edu\/hadron\/files\/2021\/02\/hadron-400x63.png 400w\" sizes=\"auto, (max-width: 646px) 100vw, 646px\" \/><\/p>\n<p style=\"text-align: center\"><i><span style=\"font-weight: 400\">Image of<\/span><\/i> <em><span style=\"font-weight: 400\">\u201cMutated\u201d sentences compared to the original sentence<\/span><\/em><\/p>\n<p style=\"text-align: center\"><span style=\"font-weight: 400\">Source:<\/span> <span style=\"font-weight: 400\">Hie 2021<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-weight: 400\">For example, take the following sentences \u201cmutated\u201d from the original \u201cwinegrowers revel in <\/span><i><span style=\"font-weight: 400\">good <\/span><\/i><span style=\"font-weight: 400\">season\u201d: \u201cwinegrowers revel in <\/span><i><span style=\"font-weight: 400\">strong <\/span><\/i><span style=\"font-weight: 400\">season,\u201d and \u201cwinegrowers revel in <\/span><i><span style=\"font-weight: 400\">flu <\/span><\/i><span style=\"font-weight: 400\">season.\u201d Both variations have the same grammatical structure, but one has changed the meaning of the sentence significantly more than the other. 
The mutations whose \u201cmeaning\u201d changes the most while the sequence stays \u201cgrammatical\u201d are the ones flagged as likely to allow viral immune escape.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Comparing their predictions of escape mutations to real viruses in the lab, researchers found that the model achieved area under the curve (AUC) scores ranging from 0.69 to 0.85, better than many state-of-the-art models. This procedure shows serious potential for public health. Understanding which mutations can go undetected by last year\u2019s antibodies can help determine how well previous vaccines and antibodies will fare this year.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Most notably, the team ran the model on new variants of the coronavirus, including the highly infectious UK variant, as well as variants from Denmark, Singapore, Malaysia, and South Africa, in all of which a high potential for viral immune escape was found.<\/span><\/p>\n<p><span style=\"font-weight: 400\">And this is just the beginning. With this technology, scientists can get a better understanding of the world around us, extending NLP technologies beyond human language and into the language of a mutated virus.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-weight: 400\">References<\/span><\/p>\n<p><span style=\"font-weight: 400\">Hie, B., et al. (2021). Learning the language of viral evolution and escape. Science, Vol. 371, Issue 6526, pp. 284-288, DOI: 10.1126\/science.abd7331. Retrieved 18 January 2021, from <\/span><a href=\"https:\/\/science.sciencemag.org\/content\/371\/6526\/284\"><span style=\"font-weight: 400\">https:\/\/science.sciencemag.org\/content\/371\/6526\/284<\/span><\/a><span style=\"font-weight: 400\">.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Heaven, W. (2021). AIs that read sentences are now catching coronavirus mutations. MIT Technology Review. 
Retrieved 18 January 2021, from <\/span><a href=\"https:\/\/www.technologyreview.com\/2021\/01\/14\/1016162\/ai-language-nlp-coronavirus-hiv-flu-mutations-antinbodies-immune-vaccines\/\"><span style=\"font-weight: 400\">https:\/\/www.technologyreview.com\/2021\/01\/14\/1016162\/ai-language-nlp-coronavirus-hiv-flu-mutations-antinbodies-immune-vaccines<\/span><\/a><span style=\"font-weight: 400\">\/.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Written by Gloria Wang Natural Language Processing (NLP) is a branch of AI that deals specifically with the communication between computers and people using human language. But aside from being able to understand languages like English, Chinese, and German, NLP algorithms are now able to<\/p>\n","protected":false},"author":588,"featured_media":964,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0,"footnotes":""},"categories":[9,13],"tags":[],"class_list":["post-962","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-biology","category-technology"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/posts\/962","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/users\/588"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/comments?post=962"}],"version-history":[{"count":3,"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/posts\/962\/revisions"}],"predecessor-version":[{"id":967,"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/posts\/962\/revisions\/967"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/media\/9
64"}],"wp:attachment":[{"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/media?parent=962"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/categories?post=962"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sites.imsa.edu\/hadron\/wp-json\/wp\/v2\/tags?post=962"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}