Uncertainty estimation for time series classification: Exploring predictive uncertainty in transformer-based models for variable stars.

Names: Cabrera Vives, Guillermo Felipe; Cádiz Leyton, Martina Alicia
Date: 2024-11-26 (year: 2024)
DOI: https://doi.org/10.29393/TMUdeC-151CM1UE151
Handle: https://repositorio.udec.cl/handle/11594/10940
Description: Thesis submitted for the degree of Magíster en Ciencias de la Computación (Master in Computer Science).
Language: en
License: CC BY-NC-ND 4.0 (Attribution-NonCommercial-NoDerivs 4.0 International)
Subjects: Data processing; Astrophysics; Variable stars
Type: Thesis

Abstract: We aim to enhance transformer-based models for classifying astronomical light curves by incorporating uncertainty estimation techniques to detect misclassified instances. We tested our methods on labeled datasets from MACHO, OGLE-III, and ATLAS, introducing a framework that significantly improves the reliability of automated classification for next-generation surveys. We used a transformer-based encoder, Astromer, designed to capture representations of single-band light curves. We enhanced its capabilities by applying three methods for quantifying uncertainty: Monte Carlo Dropout (MC Dropout), Hierarchical Stochastic Attention (HSA), and a novel hybrid method combining both approaches, which we have named Hierarchical Attention with Monte Carlo Dropout (HA-MC Dropout). We compared these methods against a baseline of Deep Ensembles (DEs). For the misclassification detection task, we selected Sampled Maximum Probability (SMP), Probability Variance (PV), and Bayesian Active Learning by Disagreement (BALD) as uncertainty estimates. In terms of predictive performance, HA-MC Dropout outperforms the baseline, achieving macro F1-scores of 79.8 ± 0.5 on OGLE-III, 84 ± 1.3 on ATLAS, and 76.6 ± 1.8 on MACHO. In the misclassification detection task, it achieves its highest improvement over the baseline, 8.5 ± 1.6, using the PV score on OGLE-III.
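
The three uncertainty estimates named in the abstract (SMP, PV, BALD) are all computed from a set of stochastic forward passes, such as those produced by MC Dropout. The sketch below is illustrative only: it uses definitions commonly found in the uncertainty estimation literature, which may differ in detail from the ones used in the thesis, and the function names and toy probability vectors are assumptions made for this example rather than anything taken from the record.

```python
import math

# Each "samples" argument is a list of T class-probability vectors for ONE
# light curve, one vector per stochastic forward pass (e.g. dropout kept on).
# All names here are hypothetical and chosen for illustration.

def mean_probs(samples):
    """Average the class probabilities over the T stochastic passes."""
    t = len(samples)
    n_classes = len(samples[0])
    return [sum(s[c] for s in samples) / t for c in range(n_classes)]

def entropy(p):
    """Shannon entropy in nats, skipping zero-probability classes."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def smp(samples):
    """Sampled Maximum Probability: 1 minus the largest mean probability."""
    return 1.0 - max(mean_probs(samples))

def pv(samples):
    """Probability Variance: per-class variance across passes, averaged over classes."""
    m = mean_probs(samples)
    t = len(samples)
    variances = [sum((s[c] - m[c]) ** 2 for s in samples) / t
                 for c in range(len(m))]
    return sum(variances) / len(variances)

def bald(samples):
    """BALD: entropy of the mean prediction minus the mean per-pass entropy."""
    mean_entropy = sum(entropy(s) for s in samples) / len(samples)
    return entropy(mean_probs(samples)) - mean_entropy

# Toy inputs: three passes that agree, and three passes that disagree.
confident = [[0.9, 0.05, 0.05]] * 3
disagreeing = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]

for name, samples in [("confident", confident), ("disagreeing", disagreeing)]:
    print(name, smp(samples), pv(samples), bald(samples))
```

Under these definitions, all three scores grow when the passes disagree, which is the property a misclassification detector relies on: predictions with high scores are flagged as likely errors.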