
Using Quantum Computing Algorithms to Analyze Gene Networks

SYNGAP1-related disorder (SRD) is a rare genetic disorder caused by a variant in the SYNGAP1 gene. SRD typically presents in early childhood as epilepsy, global developmental delay, severe behavioral problems, and a range of other complications. Understanding the mechanisms of SRD could shed light on aspects of autism, cerebral palsy, and other neurological disorders. SRD disrupts the lives of many children and families, and the cost of supporting a child with an intellectual disability is four to six times greater than that of supporting a neurotypical child, according to the National Library of Medicine. A cure for SRD has yet to be found.

Quantum computing is a rapidly evolving field that uses quantum mechanics to solve complex problems. One problem that quantum computing is considered well suited to is the Maximum Cut problem (Max-Cut), which involves dividing a network into two groups according to the weights of its connections. This study employed an open-source quantum algorithm designed to compute solutions to the Max-Cut problem and applied it to co-expression genetic networks associated with three distinct SRD symptoms. The study provided a deeper level of insight into SRD co-expression networks and SYNGAP1 gene mechanisms. The results suggested gene associations beyond those present in current co-expression network data. The study also generated relative cut values, a novel variable that could have implications for the future of gene therapy, for partitioning these co-expression networks to extract SYNGAP1. Although this study examined SYNGAP1, its analysis of genetic networks has the potential to apply to various genetic editing techniques as well as to genetic disorder diagnosis.

Introduction 

This study examined three phenotypic traits of SRD: epilepsy, behavioral disorders, and global developmental delay (GDD).

  1. Epilepsy: Epilepsy is one of the most common symptoms of SRD. It is characterized by subtle eyelid flutters, brief jerks, staring seizures, and drop seizures [1]. This study used a sample of genes associated with early childhood epilepsy, which included SCN1A, STXBP1, SNAP25, and SYT1 [2].
  2. Behavioral Disorders: In early childhood, SRD can also present as behavioral disorders, including sensory sensitivities, language difficulties, and repetitive behaviors [3]. This study used a sample of genes associated with various behavioral disorders, which included L2HGDH [4], ABAT, and SERPINA7 [5].
  3. Global Developmental Delay (GDD): SRD is also known to present as GDD [1]. GDD is defined as a significant delay in motor, cognitive, speech, or social skills, and the term is used for children under five years of age [6]. This study used a sample of genes associated with DNA repair and replication, mutations of which could lead to GDD. This sample included RAD51, which plays a role in homologous recombination and whose mutations have been found to affect brain development [7].

This study used an open-source algorithm developed for quantum simulation [8]. Specifically, the algorithm was created to solve the Maximum Cut (Max-Cut) problem, a network optimization problem. Its goal is to partition the nodes of a graph into two groups so that the number of connections cut between the groups (or, in a weighted graph, their total weight) is as large as possible. The Max-Cut problem has various applications, including minimizing communication costs, maximizing bandwidth, analyzing social networks, and partitioning molecular graphs. This study is one of the first practical applications of the Max-Cut problem to genetic networks. The decision version of the Max-Cut problem is NP-complete, meaning that no efficient (polynomial-time) algorithm for it is known. Quantum computing algorithms, such as the Quantum Approximate Optimization Algorithm (QAOA), aim to approximate the optimal solution to the Max-Cut problem given the nodes and weights of the network. Quantum simulation was used for this study because it offers a potential advantage over classical computing. Converting the Max-Cut problem to a quadratic program for quantum simulation can theoretically offer a more accurate analysis of the network and the ability to compute the problem exponentially faster than classical computing as more nodes (genes) are added to the network [9].
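To make the Max-Cut definition concrete, the following is a minimal classical sketch that checks every possible two-group labelling of a small weighted graph. It is not the quantum algorithm used in this study; the function names and the four-node graph with its weights are hypothetical and chosen only for illustration.

```python
from itertools import product


def cut_weight(edges, assignment):
    """Total weight of edges whose endpoints fall in different groups."""
    return sum(w for u, v, w in edges if assignment[u] != assignment[v])


def brute_force_max_cut(num_nodes, edges):
    """Try every 0/1 labelling, with node 0 fixed in group 0 to skip mirror images."""
    best_value, best_assignment = -1.0, None
    for rest in product((0, 1), repeat=num_nodes - 1):
        assignment = (0,) + rest
        value = cut_weight(edges, assignment)
        if value > best_value:
            best_value, best_assignment = value, assignment
    return best_value, best_assignment


# Hypothetical 4-node weighted graph (illustrative weights, not study data).
toy_edges = [(0, 1, 1.0), (0, 2, 2.0), (1, 2, 2.0), (2, 3, 3.0)]
best_value, best_assignment = brute_force_max_cut(4, toy_edges)
print(best_value, best_assignment)
```

The loop visits 2^(n-1) labellings for n nodes, so each added gene doubles the work. That growth is the reason exhaustive search stops being practical for larger networks.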

Related Work 

Extensive work has been put into developing sustainable quantum computers. There has also recently been a considerable amount of research examining the impact of SYNGAP1 on proteins and neurodevelopmental disorders. Research suggests that quantum computing is a reliable and efficient method of solving large-scale bioinformatics problems. However, there is a considerable lack of practical research applying quantum computing to computational biology.

Current efforts in quantum computing have been directed toward the development of large-scale quantum computers. In December 2024, researchers at Google Quantum AI announced Willow, a quantum chip capable of performing large-scale quantum error correction. Willow was also reported to solve a benchmark problem dramatically faster than current supercomputers [10]. Willow has the potential to find application in many different fields, including drug discovery and the development of sustainable technology. This study, by contrast, was conducted through a quantum simulation using Qiskit, a software development kit for programming quantum computers. Further advances in quantum hardware have the potential to be a catalyst for advances in bioinformatics.

In May 2023, a study published by Springer Nature explored the efficacy of quantum computing in the context of computational biology. The study explored the implications of quantum computing for protein folding, molecular dynamics, and bioinformatics. The authors concluded that, although knowledge gaps remain in the field of quantum computing, its efficiency and effectiveness, together with its potential to compute accurate and sustainable biological data, justify the use of quantum computers over classical computers [11]. My study focuses on the optimization of the Max-Cut algorithm in the context of genetic networking to ensure accurate and reliable results.

In September 2024, a study published by Oxford Academic utilized AlphaFold2, an AI system from Google’s DeepMind, to simulate a 3D model of the SYNGAP1 structure. Specifically, AlphaFold2 was used to reveal the impacts of missense mutations on the function of SYNGAP1. The study examined the relationship between SYNGAP1 under a missense mutation and Ras GTPase, a protein regulated by the product of SYNGAP1 and involved in growth, differentiation, and synaptic plasticity. The study concluded that extensive unfolding is not a prerequisite for SYNGAP-associated nonsyndromic intellectual disability (NSID). The structural insights provided in this study offered the potential for improved clinical diagnosis as well as structure-based drug discovery for SYNGAP [12]. My study aims to further examine the interactions between SYNGAP1 and associated genes rather than examine its role in protein structure.

In May 2021, a study indexed on PubMed examined predictions of Protein-Protein Interactions (PPI) by expressing them as signed networks, in which the nodes of the graph represented proteins and the edges represented the interactions between them [13]. Adjacency matrices were also created to interpret the connections between nodes. Multiple methods were used to predict the PPIs in the context of signed networks. My study focuses on gene network analysis; however, I rely heavily on the signed network representation in order to convert my network into a Docplex format.

To begin, after identifying genes relating to the three symptoms previously mentioned, this study utilized the GeneMANIA database. GeneMANIA is a real-time multiple association network integration algorithm for predicting gene function, and it can do so with state-of-the-art accuracy [14]. For this study, co-expression networks were specifically utilized. Although the GeneMANIA search returned a large-scale network, the network needed to be truncated to a sample of around seven to nine genes due to current quantum computing simulation limitations. For the GDD network, due to the lack of data supporting a direct co-expression connection between a GDD-associated gene and SYNGAP1, the gene XRCC2 (a homologous recombination facilitator) was used as a ‘bridge’ between SYNGAP1 and the rest of the network. In the epilepsy and behavioral disorder networks, however, a direct connection to SYNGAP1 was found. In total, this study used Qiskit installed in a Python virtual environment, as well as the network imaging and data provided through the GeneMANIA database.

Network Data 

As an example, this is what a co-expression network of genes associated with GDD looked like.

An image depicting a genetic network with GDD-associated genes, as well as SYNGAP1.

Along with the network image, the network data was also exported from GeneMANIA, which is shown here.

Raw exported network data.

To properly format the data into a Docplex model for QAOA, the majority of the network needed to be truncated, as the current simulator did not allow for large-scale analysis due to computational limits. Each gene in the new, smaller network was also assigned an integer, as per the Docplex model’s requirements. This truncated data is shown here.

Formatted network data.
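The formatting step described above (truncating the network and assigning each gene an integer) can be sketched as follows. This is only an illustration under assumptions: the function name, the row layout, and the weights are hypothetical placeholders rather than the exported GeneMANIA values, and GENE_X stands in for any gene removed by truncation.

```python
def format_network(rows, keep_genes):
    """Keep edges whose endpoints are both retained and index genes by integer."""
    index = {gene: i for i, gene in enumerate(keep_genes)}
    edges = [
        (index[a], index[b], weight)
        for a, b, weight in rows
        if a in index and b in index
    ]
    return index, edges


# Placeholder rows in (gene, gene, weight) form; the weights are made up.
rows = [
    ("SYNGAP1", "XRCC2", 0.02),
    ("XRCC2", "RAD51", 0.05),
    ("RAD51", "GENE_X", 0.01),  # dropped below because GENE_X is not retained
]
keep = ["SYNGAP1", "XRCC2", "RAD51"]
index, edges = format_network(rows, keep)
print(index)
print(edges)
```

Keeping the gene-to-integer dictionary makes it possible to translate the final grouping back into gene names after the optimization has run.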

Quantum Computation of Formatted Network Data 

The formatted network was then passed to the Max-Cut algorithm, whose steps are depicted here.

A flow chart depicting the steps that the Max-Cut algorithm takes, starting from the edited network input and ending with the final grouping, as well as a network cut weight relative to the weights of the genetic connections present in the network.
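One step in this kind of pipeline is the conversion of Max-Cut into a quadratic program over binary variables, where x_i is 0 or 1 depending on the group of node i and each edge contributes w_ij * (x_i + x_j - 2 * x_i * x_j). The sketch below builds those coefficients in plain Python and finds the best assignment by enumeration instead of QAOA. The function names and the toy graph are assumptions for illustration, not the code used in this study.

```python
from itertools import product


def max_cut_to_quadratic(num_nodes, edges):
    """Coefficients of the objective sum over edges of w * (x_i + x_j - 2 * x_i * x_j)."""
    linear = [0.0] * num_nodes
    quadratic = {}
    for i, j, w in edges:
        linear[i] += w
        linear[j] += w
        key = (min(i, j), max(i, j))
        quadratic[key] = quadratic.get(key, 0.0) - 2.0 * w
    return linear, quadratic


def evaluate(linear, quadratic, x):
    """Objective value of a binary assignment x under the quadratic program."""
    value = sum(c * x[i] for i, c in enumerate(linear))
    value += sum(c * x[i] * x[j] for (i, j), c in quadratic.items())
    return value


# Hypothetical 4-node weighted graph (illustrative weights, not study data).
edges = [(0, 1, 1.0), (0, 2, 2.0), (1, 2, 2.0), (2, 3, 3.0)]
linear, quadratic = max_cut_to_quadratic(4, edges)

# Classical stand-in for the optimizer: enumerate all binary assignments.
best_x = max(product((0, 1), repeat=4), key=lambda x: evaluate(linear, quadratic, x))
best_value = evaluate(linear, quadratic, best_x)
print(best_x, best_value)
```

For any assignment, the value of this objective equals the total weight of the edges that cross between the two groups, which is why maximizing it solves Max-Cut.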

Results 


The Max-Cut partitions produced an interesting separation of genes. For the epilepsy network, group one contains genes associated with synaptic vesicle fusion, such as SNAP25, SYT, and STX2. SCN1A is an outlier in this group, as it relates to triggering neurotransmitter release. Group two contains SYNGAP1 along with genes associated with synaptic transmission, such as SLC18 and SNCA. This grouping reflects SYNGAP1’s phenotypic association with synaptic transmission rather than synaptic fusion.

The GDD network is especially interesting because of its structure. Visually, RAD51 was a hub gene, meaning that it had direct connections to many other genes in the network, as can be seen in Figure 1. However, even though SYNGAP1 sat on the outside of the network and had only one direct connection, to XRCC2, the algorithm grouped only the hub gene and SYNGAP1 together. RAD51 encodes a protein crucial to homologous recombination, and its grouping with SYNGAP1 suggests that the Max-Cut algorithm may be capturing a relationship beyond the direct connections in the data it is given. This relationship can be interpreted as a potential functional module beyond co-expression.
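When reading groupings like these, it helps to keep in mind how Max-Cut behaves on a hub-shaped network: because it maximizes the weight crossing between groups, directly connected nodes tend to land in opposite groups. The toy sketch below uses generic node labels and hypothetical unit weights, not the study's data, to show a hub and a peripheral node ending up in the same group even though they share no direct edge.

```python
from itertools import product


def max_cut_groups(nodes, edges):
    """Brute-force Max-Cut; returns (cut weight, group 0, group 1)."""
    best_value, best_side = -1.0, None
    for bits in product((0, 1), repeat=len(nodes) - 1):
        # The first node is fixed in group 0 to skip mirror-image labellings.
        side = dict(zip(nodes, (0,) + bits))
        value = sum(w for u, v, w in edges if side[u] != side[v])
        if value > best_value:
            best_value, best_side = value, side
    group0 = {n for n in nodes if best_side[n] == 0}
    group1 = {n for n in nodes if best_side[n] == 1}
    return best_value, group0, group1


# Generic toy network: a hub with three neighbours, one of which ("bridge")
# is the only link to a peripheral node ("leaf"). All weights are hypothetical.
nodes = ["hub", "a", "b", "bridge", "leaf"]
edges = [
    ("hub", "a", 1.0),
    ("hub", "b", 1.0),
    ("hub", "bridge", 1.0),
    ("bridge", "leaf", 1.0),
]
value, group0, group1 = max_cut_groups(nodes, edges)
print(value, group0, group1)
```

In this toy case every edge is cut, and the hub shares a group only with the peripheral node. Comparing a real partition against this kind of structural baseline is one way to judge how much of a grouping reflects network shape and how much reflects the edge weights.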

The results of the behavioral network are also notable. The algorithm associates SYNGAP1 with SERPINA7, NAGLU, and PMM1, which are associated with lysosomal function, metabolism, and hormone transport. The second group has similar traits but also includes XRCC2, which aids in DNA repair. This could have many interpretations. The algorithm tends to associate SYNGAP1 with lysosomal function rather than DNA repair, which differs from previous experiments. It appears that the algorithm assigns SYNGAP1 this functional role when it is compared against the larger gene cluster.

Another interesting data point is the objective function value (OFV), which provides a quantitative measure of how well the network was partitioned. The epilepsy and behavioral networks shared a very similar OFV, whereas the GDD network had a noticeably smaller OFV. This means that the algorithm was able to separate the epilepsy and behavioral disorder genes more cleanly than the GDD genes. This is evident in the number of genes placed in each group: in the GDD network, the majority of genes were placed in group one. In total, the analysis of the results offers a novel interpretation of gene networks and the implications they have for phenotypic traits.
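For a Max-Cut partition, the OFV is the total weight of the edges that cross between the two groups. Because raw OFVs depend on how many edges a network has and how heavy they are, one possible way to compare networks is to divide the OFV by the network's total edge weight. The sketch below shows both quantities; the normalization is an assumption made here for illustration and may differ from how relative cut values were computed in this study, and the graph and weights are hypothetical.

```python
def objective_function_value(edges, groups):
    """Total weight of edges whose endpoints are in different groups."""
    return sum(w for u, v, w in edges if groups[u] != groups[v])


def relative_cut_value(edges, groups):
    """OFV as a fraction of total edge weight (assumed normalization)."""
    total = sum(w for _, _, w in edges)
    if total == 0:
        return 0.0
    return objective_function_value(edges, groups) / total


# Hypothetical 4-node network and grouping (illustrative, not study data).
edges = [(0, 1, 1.0), (0, 2, 2.0), (1, 2, 2.0), (2, 3, 3.0)]
groups = {0: 0, 1: 0, 2: 1, 3: 0}
ofv = objective_function_value(edges, groups)
ratio = relative_cut_value(edges, groups)
print(ofv, ratio)
```

A ratio near 1 means almost all of the connection weight crosses between the groups, while a low ratio means much of it remains inside a group.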

Based on these results, this study has demonstrated that quantum computing algorithms can open a new level of genetic network understanding. Not only was the algorithm able to group genes based on function, but it also suggested novel connections that were not present in the data that it was given. Understanding these connections demonstrates the potential of quantum computing not only for studying the symptoms of SRD but also for analyzing large-scale genetic networks more broadly. Improvements in this technology have the potential to lead to new treatments for individuals living with SRD. Furthermore, the OFV could help guide gene editing technologies as a rapid and reliable source of network analysis. However, further research is required to increase the impact of this new tool:

1. The program should be run on quantum hardware rather than a simulator to ensure OFV accuracy.

2. The program should be run on quantum hardware that processes seventy or more qubits so that large-scale networks can be run without the need to truncate them.

3. The program should be run through more optimization iterations before returning results, to ensure reliable gene grouping.
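The second recommendation can be put in rough numbers. In the standard Max-Cut encoding each node (gene) uses one qubit, and a simulator that stores the full quantum state needs 2^n complex amplitudes for n qubits. The sketch below estimates that memory, assuming a dense statevector with 16-byte complex amplitudes; the simulator used in this study may store its state differently, so the figures are only indicative.

```python
def statevector_bytes(num_qubits, bytes_per_amplitude=16):
    """Memory for a dense statevector: 2**n amplitudes at 16 bytes each (complex128)."""
    return (2 ** num_qubits) * bytes_per_amplitude


# One qubit per gene is assumed (standard Max-Cut encoding).
for n in (9, 30, 70):
    print(f"{n} qubits: {statevector_bytes(n):,} bytes")
```

Memory doubles with every added gene, so a seventy-gene network is far beyond what a full-state simulation can hold, which is why real hardware is suggested for untruncated networks.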

This study supports the reliability, value, and novelty that quantum computing can have in guiding practical bioinformatics applications, which will continue to impact the future of genetic diagnosis and therapy.

Acknowledgements

I want to thank the many contributors to the Qiskit community and IBM Quantum for their extensive open-source contributions to the quantum computing field. I would also like to thank the team of researchers at the University of Toronto for the user-friendly GeneMANIA database.

Citations:

[1] Syngap Research Fund. (n.d.). What is SYNGAP1-related disorders? Cure Syngap1. https://curesyngap1.org/what-is-syngap1/

[2] Wang, J., Lin, Z. J., Liu, L., Xu, H. Q., Shi, Y. W., Yi, Y. H., … & Liao, W. P. (2017). Epilepsy-associated genes. Seizure, 44, 11-20.

[3] Wright, D., Kenny, A., Eley, S., McKechanie, A. G., & Stanfield, A. C. (2022). Clinical and behavioural features of SYNGAP1-related intellectual disability: a parent and caregiver description. Journal of Neurodevelopmental Disorders, 14(1), 34.

[4] Ma, S., Sun, R., Jiang, B., Gao, J., Deng, W., Liu, P., … & Guan, K. L. (2017). L2hgdh deficiency accumulates l-2-hydroxyglutarate with progressive leukoencephalopathy and neurodegeneration. Molecular and Cellular Biology.

[5] Supplementary Table 1. 2742 confirmed disease-causing genes targeted in this study

[6] Habibullah, H., Albradie, R., & Bashir, S. (2019). Identifying pattern in global developmental delay children: A retrospective study at King Fahad specialist hospital, Dammam (Saudi Arabia). Pediatric Reports, 11(4).

[7] Thomas, M., Dubacq, C., Rabut, E., Lopez, B. S., & Guirouilh-Barbat, J. (2023). Noncanonical roles of RAD51. Cells, 12(8), 1169.

[8] Stachoń, M. A. Solving Optimization problems using Qiskit Aqua.

[9] Dupont, M., Sundar, B., Evert, B., Neira, D. E. B., Peng, Z., Jeffrey, S., & Hodson, M. J. (2024). Quantum Optimization for the Maximum Cut Problem on a Superconducting Quantum Computer. arXiv preprint arXiv:2404.17579.

[10] Neven, H. (2024). Meet Willow, our state-of-the-art quantum chip. Google (9 December 2024). [Online] https://blog.google/technology/research/google-willow-quantum-chip/ (accessed 9 December 2024).

[11] Pal, S., Bhattacharya, M., Lee, S. S., & Chakraborty, C. (2024). Quantum computing in the next generation computational biology landscape: From protein folding to molecular dynamics. Molecular Biotechnology, 66(2), 163-178.

[12] Ali, A. E., Li, L. L., Courtney, M. J., Pentikäinen, O. T., & Postila, P. A. (2024). Atomistic simulations reveal impacts of missense mutations on the structure and function of SynGAP1. Briefings in Bioinformatics, 25(6), bbae458.

[13] Xiang, Z., Gong, W., Li, Z., Yang, X., Wang, J., & Wang, H. (2021). Predicting protein–protein interactions via gated graph attention signed network. Biomolecules, 11(6), 799.

[14] Mostafavi, S., Ray, D., Warde-Farley, D., Grouios, C., & Morris, Q. (2008). GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function. Genome Biology, 9, 1-15.




