Prediction of multiple sclerosis using machine learning classifiers: a comparative study


Machine Learning
Multiple Sclerosis


INTRODUCTION: Hamadan Province is one of Iran's high-risk regions for multiple sclerosis (MS). Early diagnosis of MS with an accurate prediction system can help control the disease. This study aimed to compare the performance of four machine learning techniques with that of traditional methods for predicting MS.

METHODS: The study used data on 200 patients from a case-control study conducted in Hamadan, Western Iran, from 2013 to 2015. Six classifiers were compared in terms of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR+), negative likelihood ratio (LR-) and total accuracy.
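The seven criteria listed above can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code: the names `ConfusionCounts` and `diagnostic_metrics` are hypothetical, and the counts in the usage example are illustrative only and are not taken from the study.

```python
from dataclasses import dataclass
import math


@dataclass
class ConfusionCounts:
    """Counts from a binary confusion matrix (MS = positive class)."""
    tp: int  # MS cases predicted as MS
    fp: int  # controls predicted as MS
    tn: int  # controls predicted as controls
    fn: int  # MS cases predicted as controls


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator != 0 else math.nan


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Compute the seven evaluation criteria from confusion-matrix counts."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)
    specificity = _ratio(c.tn, c.tn + c.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "PPV": _ratio(c.tp, c.tp + c.fp),
        "NPV": _ratio(c.tn, c.tn + c.fn),
        # LR+ = sensitivity / (1 - specificity)
        "LR+": _ratio(sensitivity, 1 - specificity),
        # LR- = (1 - sensitivity) / specificity
        "LR-": _ratio(1 - sensitivity, specificity),
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
    }


if __name__ == "__main__":
    # Illustrative counts only; these are NOT results from the study.
    example = ConfusionCounts(tp=40, fp=10, tn=40, fn=10)
    for name, value in diagnostic_metrics(example).items():
        print(f"{name}: {value:.2f}")
```

Returning NaN for an undefined ratio (for example, LR+ when specificity equals 1) is one possible design choice; a real analysis might instead report such cases explicitly.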

RESULTS: The Random Forest (RF) model outperformed the other models in both scenarios, with higher specificity (0.67), PPV (0.68) and total accuracy (0.68). The most influential diagnostic factors for MS were age, birth season and gender.

CONCLUSIONS: Our findings showed that although all six methods performed similarly, the RF model performed slightly better on several criteria of prediction accuracy. Accordingly, this approach is an effective classifier for predicting MS at an early stage and helping to control the disease.

