We consider a high-dimensional model in which variables are observed over time and space. The model consists of a spatio-temporal regression containing a time lag and a spatial lag of the dependent variable. Unlike classical spatial autoregressive models, we do not rely on a predetermined spatial interaction matrix, but infer all spatial interactions from the data. Assuming sparsity, we estimate the spatial and temporal dependence fully data-driven by penalizing a set of Yule–Walker equations. This regularization can be left unstructured, but we also propose customized shrinkage procedures when observations originate from spatial grids (e.g. satellite images). Finite sample error bounds are derived and estimation consistency is established in an asymptotic framework wherein the sample size and the number of spatial units diverge jointly. Exogenous variables can be included as well. A simulation exercise shows strong finite sample performance compared to competing procedures. As an empirical application, we model satellite measured nitrogen dioxide (NO2) concentrations in London. Our approach delivers forecast improvements over a competitive benchmark and we discover evidence for strong spatial interactions.