Drug-Disease Association Analysis via Machine Learning on Extracted Features by Matrix Decomposition
Mathematics Interdisciplinary Research
Volume 10, Issue 3, Azar 2025, Pages 295-313
Article type: Original Scientific Paper
DOI: 10.22052/mir.2025.256471.1506
Authors
Zahra Rafei; Seyedeh Fatemeh Hosseini; Behnam Yousefimehr; Mehdi Ghatee*
Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, I. R. Iran
Abstract
Drug repurposing presents a cost-effective and time-efficient alternative to traditional drug discovery by identifying new therapeutic uses for existing medications. As biomedical data grows in scale and complexity, there is an increasing demand for predictive models that balance accuracy, interpretability, and computational efficiency. In this study, we systematically evaluate hybrid models that combine established matrix factorization techniques with machine learning regressors, with an emphasis on interpretable and lightweight models such as the Decision Tree Regressor. Using the widely adopted Fdataset, which comprises 1,933 known associations between 593 drugs and 313 diseases, we demonstrate that several of these hybrid approaches match or surpass the predictive performance of complex models such as WNMFDDA while significantly reducing memory usage and training time. Notably, our framework relies solely on the drug-disease association matrix, removing the dependency on auxiliary similarity data, which is often unavailable in real-world applications. Among the tested models, the combination of NMF with DecisionTreeRegressor offers the highest accuracy, making it well suited to accuracy-critical scenarios, while the Ridge model stands out for its efficiency and suitability for resource-constrained environments. To enhance transparency, we further apply LIME (Local Interpretable Model-Agnostic Explanations) to provide interpretable insights into model predictions. These findings point to a practical and scalable framework for drug repurposing, particularly in environments with limited computational resources. Our approach supports the development of accessible, data-driven predictive tools that accelerate the transition from computational modeling to clinical application.
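The hybrid idea summarized in the abstract (factorize the drug-disease association matrix, then train a lightweight regressor on the extracted latent features) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy 12 x 8 matrix, the rank of 2, the choice of Lee-Seung multiplicative updates for NMF, the element-wise product used to build pair features, and the closed-form ridge regressor standing in for the regressors the authors evaluate. The helper names (`nmf`, `pair_features`, `ridge_fit`, `ridge_predict`) are hypothetical.

```python
import numpy as np


def nmf(A, rank, n_iter=300, seed=0, eps=1e-9):
    """Approximate a non-negative matrix A by W @ H using
    Lee-Seung multiplicative updates (Frobenius-norm objective)."""
    rng = np.random.default_rng(seed)
    W = rng.random((A.shape[0], rank)) + eps
    H = rng.random((rank, A.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


def pair_features(W, H, drugs, diseases):
    """One feature vector per (drug, disease) pair: the element-wise
    product of the two latent vectors. This construction is an
    assumption of the sketch, not the paper's stated design."""
    return W[drugs] * H[:, diseases].T


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[-1, -1] = 0.0  # leave the intercept unregularized
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


def ridge_predict(X, coef):
    """Apply the fitted weights (all but last) and intercept (last)."""
    return X @ coef[:-1] + coef[-1]


# Toy association matrix (hypothetical data): two drug groups,
# each associated with its own group of diseases.
A = np.zeros((12, 8))
A[:6, :4] = 1.0
A[6:, 4:] = 1.0

# Hide two known associations to mimic undiscovered drug-disease links.
hidden = [(0, 1), (7, 5)]
A_train = A.copy()
for i, j in hidden:
    A_train[i, j] = 0.0

# Step 1: extract latent drug (W) and disease (H) features.
W, H = nmf(A_train, rank=2)

# Step 2: build features for every pair and fit the regressor on the
# observed (partly masked) matrix entries.
drugs, diseases = np.indices(A_train.shape)
drugs, diseases = drugs.ravel(), diseases.ravel()
X = pair_features(W, H, drugs, diseases)
y = A_train.ravel()
coef = ridge_fit(X, y, alpha=1.0)

# Step 3: score all pairs; high scores on zero entries are candidates.
scores = ridge_predict(X, coef).reshape(A_train.shape)
for i, j in hidden:
    print(f"hidden pair ({i}, {j}): score = {scores[i, j]:.3f}")
```

Only the association matrix is used as input, which mirrors the abstract's point about avoiding auxiliary similarity data. The sketch scores the same pairs it was fitted on, so it is not an evaluation protocol; a real study would use cross-validation with held-out associations masked before factorization.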
Keywords
Drug-disease association; Machine learning; Matrix decomposition; Hybrid models