Detection of attention deficit hyperactivity disorder using machine learning classification algorithms based on electroencephalography signals.
Date
2025
Publisher
Universidad de Concepción
Abstract
ADHD is one of the most common psychiatric and neurodevelopmental disorders worldwide, affecting millions of children and adolescents. This research project implemented a machine learning methodology to identify and classify EEG signals from subjects with ADHD in a dataset of 121 participants (61 ADHD, 60 control). The signals were first filtered and artifacts were removed; the data were then stored and prepared for feature extraction in the time domain (statistical, morphological, and nonlinear features) and in the frequency domain. Band power was computed for four bands of interest (delta, theta, alpha, and beta) using Welch's method. The most relevant features were selected with LASSO, which significantly reduced the dimension of the stored feature matrix and also improved the performance of the classification models. A statistical analysis of the most relevant features showed that they were concentrated in the right frontal and parietal regions and exhibited more alpha and beta wave activity. Four classifiers were evaluated: SVM, Logistic Regression, Random Forest, and Naive Bayes. Their performance was reported in terms of accuracy, precision, recall, and F1-score; Logistic Regression gave the most robust results, with 85.83% accuracy under 10-fold cross-validation. The methodology was effective for classifying EEG signals to identify individuals with ADHD and could be used in future research.
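As an illustration of the band-power step mentioned in the abstract, the following is a minimal sketch of how power in the delta, theta, alpha, and beta bands could be estimated with Welch's method. It is not the thesis code. The band limits, the 128 Hz sampling rate, the 2-second window, and the helper name `band_powers` are all assumptions introduced here, and the input is a synthetic sinusoid rather than real EEG.

```python
import numpy as np
from scipy.signal import welch

# Conventional band edges in Hz. The abstract names the bands but not their
# limits, so these values are assumptions.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def band_powers(signal, fs, bands=BANDS, window_sec=2.0):
    """Return absolute power per band for one channel (hypothetical helper)."""
    # Segment length for Welch's averaged periodogram, capped at signal length.
    nperseg = min(len(signal), int(window_sec * fs))
    freqs, psd = welch(signal, fs=fs, nperseg=nperseg)
    df = freqs[1] - freqs[0]  # frequency resolution of the PSD estimate
    powers = {}
    for name, (low, high) in bands.items():
        mask = (freqs >= low) & (freqs < high)
        # Rectangle-rule integral of the PSD over the band.
        powers[name] = float(psd[mask].sum() * df)
    return powers


# Synthetic example: a 10 Hz sinusoid plus weak noise, not real EEG.
fs = 128.0  # assumed sampling rate in Hz
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)

powers = band_powers(x, fs)
dominant = max(powers, key=powers.get)
print(dominant, powers)
```

In a pipeline like the one described, a function of this kind would be applied per channel, and the resulting values would join the time-domain features before selection and classification.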
Description
Thesis submitted to obtain the degree of Biomedical Engineer.
Keywords
Attention deficit hyperactivity disorder, Machine learning, Electroencephalography