Abstract
This paper develops a statistically principled approach to kernel density estimation on a network of lines, such as a road network. Existing heuristic techniques are reviewed, and their weaknesses are identified. The correct analogue of the Gaussian kernel is the ‘heat kernel’, the occupation density of Brownian motion on the network. The corresponding kernel estimator satisfies the classical time-dependent heat equation on the network. This ‘diffusion estimator’ has good statistical properties that follow from the heat equation. It is mathematically similar to an existing heuristic technique, in that both can be expressed as sums over paths in the network. However, the diffusion estimate is an infinite sum, which cannot be evaluated using existing algorithms. Instead, the diffusion estimate can be computed rapidly by numerically solving the time-dependent heat equation on the network. This also enables bandwidth selection using cross-validation. The diffusion estimate with automatically selected bandwidth is demonstrated on road accident data.
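As a rough illustration of computing a diffusion estimate by solving the heat equation numerically, the sketch below runs an explicit finite-volume scheme on a finely subdivided network. It is not the paper's algorithm: the discretization, the function names `star_network` and `diffusion_estimate`, and the three-armed toy network are all assumptions made for illustration. It assumes the heat equation in the form ∂u/∂t = ½ ∂²u/∂x², run up to time σ², so that on a single infinite line the result would be a Gaussian kernel with standard deviation σ.

```python
import math

def star_network(arms, nodes_per_arm):
    """Hypothetical toy network: node 0 is a junction, each arm is a chain of grid nodes."""
    neighbors = [[]]
    for _ in range(arms):
        prev = 0
        for _ in range(nodes_per_arm):
            cur = len(neighbors)
            neighbors.append([prev])
            neighbors[prev].append(cur)
            prev = cur
    return neighbors

def diffusion_estimate(neighbors, dx, data_nodes, sigma):
    """Explicit Euler solve of u_t = 0.5 * u_xx on a subdivided network (assumed scheme).

    neighbors  : adjacency lists of grid nodes spaced dx apart along the network
    data_nodes : grid-node index of each observation
    sigma      : bandwidth; the equation is run up to time sigma**2
    Returns (u, width): density value and cell length for each grid node.
    """
    # Each node owns half of every incident grid segment.
    width = [0.5 * dx * len(nb) for nb in neighbors]
    u = [0.0] * len(neighbors)
    for i in data_nodes:
        u[i] += 1.0 / width[i]          # unit point mass at each observation
    total_time = sigma ** 2
    # dt <= 0.4 * dx**2 keeps the explicit scheme stable and non-negative.
    steps = max(1, math.ceil(total_time / (0.4 * dx * dx)))
    dt = total_time / steps
    for _ in range(steps):
        new = u[:]
        for i, nb in enumerate(neighbors):
            # Net inflow across the segments meeting at node i.
            flux = sum(0.5 * (u[j] - u[i]) / dx for j in nb)
            new[i] = u[i] + dt * flux / width[i]
        u = new
    return u, width

neighbors = star_network(arms=3, nodes_per_arm=20)
u, width = diffusion_estimate(neighbors, dx=0.05, data_nodes=[10], sigma=0.2)
total_mass = sum(ui * wi for ui, wi in zip(u, width))
```

Because the fluxes between neighbouring cells cancel in pairs, the total mass is preserved at every step, so the estimate integrates to the number of observations. An explicit scheme is used only for brevity; an implicit solver would allow larger time steps.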
Original language | English |
---|---|
Pages (from-to) | 324-345 |
Number of pages | 22 |
Journal | Scandinavian Journal of Statistics: theory and applications |
Volume | 44 |
Issue number | 2 |
Publication status | Published - Jun 2017 |