Vector data cubes

A vector data cube is a multidimensional data structure designed to manage, analyze, and query vector-based geospatial data across spatial, temporal, and thematic dimensions. This model extends the well-established framework of raster data cubes, commonly used in remote sensing and Earth observation, to the domain of vector data, which comprises geometries such as points, lines, and polygons that represent discrete spatial features.

Vector data cubes are structured to encapsulate geospatial features that evolve over time and carry associated attributes. Each element of the cube corresponds to a spatiotemporal observation of a vector object, characterized by its geometry, timestamp, and a set of descriptive attributes. Formally, a vector data cube can be described as a collection of tuples

\[ VC = \{(g_i, t_j, a_{ij})\} \]

where \(g_i\) denotes the spatial geometry of the \(i\)-th feature, \(t_j\) the \(j\)-th temporal instance, and \(a_{ij}\) the vector of thematic attributes associated with that geometry at that point in time.

The spatial dimension captures the geometric representation of real-world entities (e.g., land parcels, administrative units), while the temporal dimension allows for tracking changes over time, and the attribute dimensions store thematic variables of interest (e.g., land use class, vegetation type). Vector data cubes are particularly advantageous in contexts where object-based spatial representations are more appropriate than gridded data.
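The tuple definition above can be made concrete with a minimal sketch. The Python snippet below is only an illustration and is not part of sits: the parcel identifiers, coordinates, dates, and the helper names `attribute_series` and `changed` are all hypothetical. It stores each observation as a mapping from a geometry identifier and a timestamp to an attribute record, then queries how one attribute evolves for a given geometry.

```python
from datetime import date

# Hypothetical geometries g_i: polygon identifiers mapped to vertex rings.
geometries = {
    "parcel_1": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "parcel_2": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}

# The cube VC: each key pairs a geometry identifier with a timestamp t_j,
# and each value is the attribute record for that geometry at that time.
vector_cube = {
    ("parcel_1", date(2020, 1, 1)): {"land_use": "forest"},
    ("parcel_1", date(2021, 1, 1)): {"land_use": "pasture"},
    ("parcel_2", date(2020, 1, 1)): {"land_use": "cropland"},
    ("parcel_2", date(2021, 1, 1)): {"land_use": "cropland"},
}


def attribute_series(cube, geometry_id, name):
    """Return the (timestamp, value) pairs of one attribute, sorted by time."""
    return sorted(
        (t, attrs[name])
        for (gid, t), attrs in cube.items()
        if gid == geometry_id
    )


def changed(cube, geometry_id, name):
    """Tell whether the attribute takes more than one value over time."""
    values = [value for _, value in attribute_series(cube, geometry_id, name)]
    return len(set(values)) > 1


for gid in geometries:
    print(gid, attribute_series(vector_cube, gid, "land_use"))
```

Keying the cube by geometry and time keeps the spatial and temporal dimensions explicit, while the attribute record can hold any number of thematic variables.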

The sits package supports a restricted version of vector data cubes. In sits, the polygons that make up a vector data cube are calculated using all temporal instances of the data cube. One of the key measures used to build vectors from a raster data cube is the local similarity between pixel values. In sits, these similarities are computed from all bands and all dates of the data cube.
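To illustrate what it means to compare pixels using all bands and all dates, the sketch below stacks every band value at every date into a single feature vector and measures the distance between two such vectors. This is an assumption-laden illustration and not the sits implementation: the Euclidean distance, the band names, the pixel values, and the functions `stack_features` and `dissimilarity` are all chosen here for the example.

```python
import math


def stack_features(pixel):
    """Concatenate the values of every band at every date into one vector."""
    return [value for band in sorted(pixel) for value in pixel[band]]


def dissimilarity(pixel_a, pixel_b):
    """Euclidean distance between stacked vectors (an assumed measure)."""
    features_a = stack_features(pixel_a)
    features_b = stack_features(pixel_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(features_a, features_b)))


# Hypothetical pixels: each band holds one value per date of the cube.
forest = {"NDVI": [0.80, 0.70, 0.80], "EVI": [0.50, 0.50, 0.40]}
forest_neighbor = {"NDVI": [0.79, 0.72, 0.80], "EVI": [0.50, 0.48, 0.41]}
bare_soil = {"NDVI": [0.20, 0.10, 0.20], "EVI": [0.10, 0.10, 0.10]}

print(dissimilarity(forest, forest_neighbor))
print(dissimilarity(forest, bare_soil))
```

Under this measure, two pixels are similar only if they behave alike throughout the whole period, which is why polygons built this way reflect temporal behavior and not a single image.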

This representation is useful for classification of land parcels whose boundaries have been computed by segmentation algorithms or which are part of a land cadastre. The basic idea is that sits will combine raster and vector information by classifying polygons based on time series defined by their interior pixels. In the following chapters, we will describe how sits can attach a vector data structure (e.g., a polygon shapefile) to an existing raster data cube. We will also show how to extract polygons based on segmentation methods such as the SLIC superpixel algorithm.
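The idea of deriving a time series for a polygon from its interior pixels can be sketched as follows. The snippet is a simplified illustration and not the procedure used by sits: it assumes pixels are identified by their center coordinates, selects those falling inside a polygon with a ray-casting test, and summarizes them by their mean at each date. The function names, coordinates, and values are hypothetical, and averaging is only one possible way of summarizing the interior pixels.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(ring)
    for k in range(n):
        x1, y1 = ring[k]
        x2, y2 = ring[(k + 1) % n]
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def mean_series(pixels, ring):
    """Average, date by date, the time series of pixels whose center is inside."""
    interior = [
        series for (x, y), series in pixels.items()
        if point_in_polygon(x, y, ring)
    ]
    if not interior:
        return None
    n_dates = len(interior[0])
    return [sum(s[d] for s in interior) / len(interior) for d in range(n_dates)]


# Hypothetical polygon and pixel centers, each pixel with a two-date series.
parcel = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
pixels = {
    (0.5, 0.5): [0.25, 0.5],
    (1.5, 0.5): [0.75, 1.0],
    (2.5, 0.5): [0.0, 0.0],  # center lies outside the parcel
}

print(mean_series(pixels, parcel))
```

The resulting series could then be passed to any time series classifier, so that the label is assigned to the polygon as a whole and not to individual pixels.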