Drug-Disease Association Analysis via Machine Learning on Extracted Features by Matrix Decomposition

Rafei, Zahra; Hosseini, Seyedeh Fatemeh; Yousefimehr, Behnam; Ghatee, Mehdi

doi:10.22052/mir.2025.256471.1506

Mathematics Interdisciplinary Research
Volume 10, Issue 3, Azar 2025 (November-December), Pages 295-313
Article type: Original Scientific Paper
Authors
Zahra Rafei; Seyedeh Fatemeh Hosseini; Behnam Yousefimehr; Mehdi Ghatee*
Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, I. R. Iran
Abstract
Drug repurposing offers a cost-effective and time-efficient alternative to traditional drug discovery by identifying new therapeutic uses for existing medications. As biomedical data grow in scale and complexity, there is increasing demand for predictive models that balance accuracy, interpretability, and computational efficiency. In this study, we systematically evaluate hybrid models that combine established matrix factorization techniques with machine learning regressors, with an emphasis on interpretable and lightweight models such as the Decision Tree Regressor. Using the widely adopted Fdataset, which comprises 1,933 known associations between 593 drugs and 313 diseases, we demonstrate that several of these hybrid approaches match or surpass the predictive performance of complex models such as WNMFDDA while significantly reducing memory usage and training time. Notably, our framework relies solely on the drug-disease association matrix, removing the dependency on auxiliary similarity data, which are often unavailable in real-world applications. Among the tested models, the hybrid of NMF and DecisionTreeRegressor offers the highest accuracy, making it well suited to accuracy-critical scenarios, while the Ridge model stands out for its efficiency and suitability for resource-constrained environments. To enhance transparency, we further apply LIME (Local Interpretable Model-Agnostic Explanations) to provide interpretable insights into model predictions. These findings highlight a practical and scalable framework for drug repurposing, particularly suited to environments with limited computational resources. Our approach supports the development of accessible, data-driven predictive tools that accelerate the transition from computational modeling to clinical application.
Keywords
Drug-disease association; Machine learning; Matrix decomposition; Hybrid models
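
The hybrid idea described in the abstract (factorize the association matrix, then train a lightweight regressor on the extracted factors) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration only: the random synthetic matrix (it is not the Fdataset), the rank of 8, the tree depth, the Ridge penalty, the 80/20 split, and the choice to build each pair's features by concatenating the drug and disease factors. The abstract does not specify the authors' exact pipeline, so this should not be read as their implementation.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for a binary drug-disease association matrix
# (an assumption for illustration; NOT the Fdataset).
n_drugs, n_diseases, rank = 60, 40, 8
A = (rng.random((n_drugs, n_diseases)) < 0.08).astype(float)

# Split all (drug, disease) index pairs into training and test sets.
pairs = rng.permutation(np.array(list(np.ndindex(A.shape))))
split = int(0.8 * len(pairs))
train_pairs, test_pairs = pairs[:split], pairs[split:]

# Hide test entries before factorizing so the factors do not see test labels.
A_train = A.copy()
A_train[test_pairs[:, 0], test_pairs[:, 1]] = 0.0

# Non-negative matrix factorization: A_train is approximated by W @ H.
nmf = NMF(n_components=rank, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(A_train)   # drug factors, shape (n_drugs, rank)
H = nmf.components_              # disease factors, shape (rank, n_diseases)


def pair_features(W, H, pairs):
    """Concatenate the drug factor row and disease factor column of each pair."""
    return np.hstack([W[pairs[:, 0]], H[:, pairs[:, 1]].T])


X_train = pair_features(W, H, train_pairs)
y_train = A[train_pairs[:, 0], train_pairs[:, 1]]
X_test = pair_features(W, H, test_pairs)

# Two lightweight regressors of the kind named in the abstract.
models = {
    "tree": DecisionTreeRegressor(max_depth=5, random_state=0),
    "ridge": Ridge(alpha=1.0),
}
scores = {name: m.fit(X_train, y_train).predict(X_test) for name, m in models.items()}
```

Each entry of `scores` holds one predicted association score per held-out pair, which could then be ranked or thresholded. Because the matrix above is random, no meaningful predictive signal should be expected from it; the sketch only shows how the pieces fit together.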


