Rapid Computational Identification of Therapeutic Targets for Pathogens
Jonathan Allen | 20-ERD-062
Project Overview
Biological threats persist and evolve, and they remain an important challenge to national security. Novel viral pathogens could emerge in multiple ways and pose a serious threat to human health. This project developed a pathogen target identification tool for responding rapidly to a novel or emerging viral biological threat. A set of computational tools was developed that provides detailed information on newly sequenced genes, their protein products, and the drug target sites on those proteins that are best suited for biological countermeasure development. The project produced three key innovations. 1) A new, extensive database of protein pocket structures, with structure-based search algorithms, rapidly links novel protein targets to the complete collection of previously experimentally solved protein structures. 2) A novel clustering pipeline groups matching structures and their associated small-molecule ligands into a consensus protein pocket, together with the small-molecule chemotypes predicted to fit in the pocket site. The matching experimentally solved structures were used to inform the value of different target sites. 3) Where a viral protein target has pockets that structurally match similar human proteins, a biological knowledge graph, which links molecular interactions with human disease, was used to further assess the risk that targeting the viral protein could cause important off-target side effects. In total, the project produced a new resource for rapid and detailed assessment of promising countermeasure targets, one that reflects the wet-lab, clinical, and computational data that continue to be collected. These capabilities will improve the ability to respond to a biological threat in multiple domains.
Mission Impact
Our collection of software tools is designed to search existing experimental biological data effectively and to report rapidly on the protein sites to target in novel viral pathogens. The software exploits high-performance computing resources through parallelized protein structure search algorithms, which allow new target proteins to be searched against an extensive reference collection of protein-ligand structure interfaces. This work supports the high-performance computing, simulation, and data science Core Competency. It also supports a core capability in advancing national security through bioassurance and biosecurity. The newly developed tools are expected to provide important insights for prioritizing protein targets when novel pathogens emerge and little is known to guide therapeutic development.
Publications, Presentations, and Patents
Accepted - "A Computational Pipeline to Identify Broad-Spectrum Drug Targets and Interacting Chemotypes in Viral Pathogens," Gordon Research Seminar on Chemical and Biological Defense, March 19, 2023.
Sarah Sandholtz, "A Computational Pipeline to Identify and Characterize Binding Sites and Interacting Chemotypes in SARS-CoV-2" (Presentation, American Chemical Society Fall 2022 National Meeting, Chicago, IL, August 22, 2022).
Sarah Sandholtz, "A Computational Pipeline to Identify Broad-Spectrum Drug Targets and Interacting Chemotypes in Viral Pathogens" (Presentation, DTRA CBD S&T Conference, San Francisco, CA, December 8, 2022).
Sarah Sandholtz, "Decision support to prioritize novel therapeutic targets in response to a viral outbreak" (Presentation, CLSAC 2020, Annapolis, MD, October 6, 2020).
Kimbrel, J., Moon, J., Avila-Herrera, A., Martí, J. M., Thissen, J., Mulakken, N., Sandholtz, S. H., Ferrell, T., Daum, C., Hall, S., Segelke, B., Arrildt, K. T., Messenger, S., Wadford, D. A., Jaing, C., Allen, J. E., & Borucki, M. K. (2022). "Multiple Mutations Associated with Emergent Variants Can Be Detected as Low-Frequency Mutations in Early SARS-CoV-2 Pandemic Clinical Samples." Viruses. 14(12), 2775. https://www.mdpi.com/1999-4915/14/12/2775
Posada, R., Silva, M., Torres, M., Allen, J., Drocco, J., Sandholtz, S., & Zemla, A. (2022). "Graph-based featurization methods for classifying small molecule compounds." UC Merced Undergraduate Research Journal, 14(1). https://doi.org/10.5070/m414157338
Sandholtz, S. H., Drocco, J. A., Zemla, A. T., Torres, M. W., Silva, M. S., & Allen, J. E. (2023). "A Computational Pipeline to Identify and Characterize Binding Sites and Interacting Chemotypes in SARS-CoV-2." ACS Omega, 8(24), 21871–21884. https://doi.org/10.1021/acsomega.3c01621
Zemla, A. T., Allen, J. E., Kirshner, D., & Lightstone, F. C. (2022). "PDBspheres: a method for finding 3D similarities in local regions in proteins." NAR Genomics and Bioinformatics, 4(4). https://doi.org/10.1093/nargab/lqac078
Lau, E. Y., Negrete, O. A., Bennett, W. F. D., Bennion, B. J., Borucki, M., Bourguet, F., Epstein, A., Franco, M., Harmon, B., He, S., Jones, D., Kim, H., Kirshner, D., Lao, V., Lo, J., McLoughlin, K., Mosesso, R., Murugesh, D. K., Saada, E. A., . . . Allen, J.E., Lightstone, F. C. (2021). "Discovery of Small-Molecule Inhibitors of SARS-CoV-2 Proteins Using a Computational and Experimental Pipeline." Front Mol Biosci, 8, 678701. https://doi.org/10.3389/fmolb.2021.678701
G.A. Stevenson, D. Jones, H. Kim, W.F.D. Bennett, B.J. Bennion, M. Borucki, F. Bourguet, A. Epstein, M. Franco, B. Harmon, S. He, M.P. Katz, D. Kirshner, V. Lao, E.Y. Lau, J. Lo, K. McLoughlin, R. Mosesso, D.K. Murugesh, J.E. Allen, "High-Throughput Virtual Screening of Small Molecule Inhibitors for SARS-CoV-2 Protein Targets with Deep Fusion Models" (Presentation, SC '21, St. Louis, MO, November 2021).
Shim, H., Kim, H., Allen, J. E., & Wulff, H. (2022). "Pose Classification Using Three-Dimensional Atomic Structure-Based Neural Networks Applied to Ion Channel-Ligand Docking." Journal of Chemical Information and Modeling, 62(10), 2301-2315. https://doi.org/10.1021/acs.jcim.1c01510