Compact data structures for raster data.

Loading...
Thumbnail Image

Date

2025

Journal Title

Journal ISSN

Volume Title

Publisher

Universidad de Concepción

Abstract

A raster model consists of a matrix of cells in which each cell contains a value that represents information. A raster time series is an ordered collection of independent rasters. Raster and raster time series models have been applied in different domains, such as terrain altitude, temperature, atmospheric pressure, images, etc. Usually, the raster time series represents the raster data changes over time. The main attribute of those models is the data locality. Data locality indicates that con tiguous cells (spatially or temporally) contain similar values. Compression data exploits this characteristic. Besides, compact data structures allow the data to be compacted and respond to different queries without descompacting. Compact data structures have been used to rep resent raster data and raster time series, but it is possible to improve the efficiency of these structures. The work on this thesis focuses on improving the raster and raster time series compres sion. The thesis introduces an algorithm for constructing the Heuristic T-k2-raster that in tegrates a dynamic-programming approach, and proposes a new compact data structure for raster time series, the ZT-k2-raster, based on the k2-raster model. Additionally, alternative heuristic construction algorithms that incorporate clustering techniques have been developed to exploit temporal patterns better. The work also includes the implementation of a massively parallel k2-raster construction algorithm for GPGPU architectures, together with a comprehen sive experimental evaluation of all proposed methods and their performance. The experimental results confirm that these strategies improve compression and represen tation under specific conditions. The dynamic-programming approach consistently outperforms the baseline heuristic on synthetic datasets with high temporal and low spatial locality. In con trast, the clustering-based heuristics provide clear advantages for datasets with cyclic temporal patterns. The analysis of compressibility measures reveals a strong positive correlation with raster size and a strong negative correlation with spatial locality. Furthermore, a new alphabet sensitive 2D compressibility measure is introduced to better characterize raster compressibility. Finally, ZT-k2-raster achieves a competitive space–time trade-off for representing raster time series, and the parallel implementation demonstrates substantial acceleration during the con struction phase.

Description

Tesis presentada para optar al grado de Doctor/a en Ciencias de la Computación.

Keywords

Raster data, Data compression (Computer science), Geospatial data

Citation

URI

Collections