Models for Spatial Data

The first two parts of this book contain a considerable amount of concepts that one could classify as “models for spatial data”, including:

how numbers relate to real-world phenomena (2 Coordinates)
how coordinates are defined in different spaces (2 Coordinates, 4 Spherical Geometries)
simple feature geometries (3 Geometries), how straight lines between points can be used to define linestrings and polygons
the set of geometry types (3.1 Simple feature geometries)
support and the way feature attributes can relate to geometries (5 Attributes and Support)
how simple tesselations can describe space-time volumes (6 Data Cubes)
how these concepts can be made operational using data science software (7 Introduction to sf and stars)

The third and largest part of this book is dedicated to statistical modelling of spatial data. The scientific discipline statistics is concerned with describing and understanding variability in observations, and predicting future observations. Observations are often modelled as

\[\mbox{observed}=\mbox{explained}+\mbox{remainder}\]

where “remainder” refers to variation that could not be explained by predictors, including measurement error but also lack of fit or variation caused by model misspecification. For spatial data, a further term is often helpful, as in

\[\mbox{observed}=\mbox{explained} + \mbox{smooth} + \mbox{remainder}\]

where “smooth” refers to variation that is not explained by external predictors but that clearly shows “smooth” spatial patterns, as opposed to the “rough” remainder which does not do this. Such a “smooth” term can for instance be modelled by base functions in coordinates (splines, smoothers) or as a random term that is spatially correlated.

10 Statistical modelling of spatial data introduces statistical modelling of spatial data, as a preparation to the subsequent chapters but also highlighting a number of relevant aspects that are not elaborated on in later chapters. It tries to bridge these chapters with concepts from the first part of this book, in particular support (5 Attributes and Support).

It is now obvious that a complete and comprehensive treatment of the topic of statistical models for spatial data that also includes instructions about the use of computational software in a single book is an impossible task. The spatstat book (Baddeley, Rubak, and Turner 2015) has around 850 pages for only spatial point patterns and R. This part focuses on the three “classical” spatial statistics topics: analysis of point patterns (11 Point Pattern Analysis), geostatistical data (Chapters 12 Spatial Interpolation and 13 Multivariate and Spatiotemporal Geostatistics), and lattice (areal) data (Chapters 14 Proximity and Areal Data-17 Spatial Econometrics Models). Where possible we attempt to refer to further literature on methods and software implementations in R.

# Models for Spatial Data The first two parts of this book contain a considerable amount of concepts that one could classify as "models for spatial data", including: * how numbers relate to real-world phenomena (@sec-cs) * how coordinates are defined in different spaces (@sec-cs, @sec-spherical) * simple feature geometries (@sec-geometries), how straight lines between points can be used to define linestrings and polygons * the set of geometry types (@sec-simplefeatures) * support and the way feature attributes can relate to geometries (@sec-featureattributes) * how simple tesselations can describe space-time volumes (@sec-datacube) * how these concepts can be made operational using data science software (@sec-sf) The third and largest part of this book is dedicated to _statistical_ modelling of spatial data. The scientific discipline _statistics_ is concerned with describing and understanding variability in observations, and predicting future observations. Observations are often modelled as $$\mbox{observed}=\mbox{explained}+\mbox{remainder}$$ where "remainder" refers to variation that could not be explained by predictors, including measurement error but also lack of fit or variation caused by model misspecification. For spatial data, a further term is often helpful, as in $$\mbox{observed}=\mbox{explained} + \mbox{smooth} + \mbox{remainder}$$ where "smooth" refers to variation that is not explained by external predictors but that clearly shows "smooth" spatial patterns, as opposed to the "rough" remainder which does not do this. Such a "smooth" term can for instance be modelled by base functions in coordinates (splines, smoothers) or as a random term that is spatially correlated. @sec-modelling introduces statistical modelling of spatial data, as a preparation to the subsequent chapters but also highlighting a number of relevant aspects that are not elaborated on in later chapters. It tries to bridge these chapters with concepts from the first part of this book, in particular support (@sec-featureattributes). It is now obvious that a complete and comprehensive treatment of the topic of statistical models for spatial data that also includes instructions about the use of computational software in a single book is an impossible task. The `spatstat` book [@baddeley2015spatial] has around 850 pages for only spatial point patterns and R. This part focuses on the three "classical" spatial statistics topics: analysis of point patterns (@sec-pointpatterns), geostatistical data (Chapters [-@sec-interpolation] and [-@sec-stgeostatistics]), and lattice (areal) data (Chapters [-@sec-area]-[-@sec-spatecon]). Where possible we attempt to refer to further literature on methods and software implementations in R.