Improving predictive models for Alzheimer's disease using GWAS data by incorporating misclassified samples modeling

Brissa Lizbeth Romero-Rosales, Jose Gerardo Tamez-Pena, Humberto Nicolini, Maria Guadalupe Moreno-Treviño, Victor Trevino*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

20 Citations (Scopus)


Late-onset Alzheimer's disease (LOAD) is the most common form of dementia in the elderly. Genome-wide association studies (GWAS) for LOAD have opened new avenues to identify genetic causes and to provide diagnostic tools for early detection. Although several predictive models have been proposed using the few detected GWAS markers, there is still a need for improvement and for the identification of potential markers. Polygenic risk scores are commonly used for prediction, but other methods to generate predictive models have been suggested. In this research, we compared three machine learning methods that have been shown to construct powerful predictive models (genetic algorithms, LASSO, and step-wise) and propose the inclusion of markers from misclassified samples to improve overall prediction accuracy. Our results show that combining the markers from an initial model with the markers of a model fitted to the misclassified samples improves the area under the receiver operating characteristic curve by around 5%, reaching ~0.84, which is highly competitive for a model using only genetic information. The computational strategy used here can help to devise better methods to improve classification models for AD. Our results could have a positive impact on the early diagnosis of Alzheimer's disease.
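
The abstract refers to two standard quantities: a polygenic risk score and the area under the receiver operating characteristic curve. The sketch below illustrates both under simple assumptions. It is not the authors' pipeline. The function names, marker weights, and genotypes are hypothetical toy values, and the score is assumed to be a plain weighted sum of risk-allele counts.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele counts (0, 1, or 2) across markers."""
    return sum(w * d for w, d in zip(weights, dosages))


def auc(scores, labels):
    """Probability that a random case scores above a random control.

    Ties count as half. This pairwise definition is equivalent to the
    area under the ROC curve.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: three markers, four individuals.
weights = [0.8, 0.3, -0.2]  # assumed per-marker effect sizes
genotypes = [
    [2, 1, 0],  # case
    [1, 2, 1],  # case
    [0, 1, 2],  # control
    [1, 0, 0],  # control
]
labels = [1, 1, 0, 0]

scores = [polygenic_risk_score(g, weights) for g in genotypes]
toy_auc = auc(scores, labels)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls, which is the scale on which the reported ~0.84 should be read.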

Original language: English
Article number: e0232103
Pages (from-to): e0232103
Journal: PLoS One
Issue number: 4
Publication status: Published - Apr 2020

Bibliographical note

Funding Information:
This analysis was partially supported by the institutional grant Grupo de Investigación con Enfoque Estratégico en Bioinformática para el Diagnóstico Clínico from Tecnológico de Monterrey. Consejo Nacional de Ciencia y Tecnología (CONACyT) provided scholarship 861461 for Brissa-Lizbeth Romero-Rosales.

Publisher Copyright:
© 2020 Romero-Rosales et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

All Science Journal Classification (ASJC) codes

  • General

