Gaussian Processes and Fast Matrix-Vector Multiplies

Iain Murray, 2009.

Gaussian processes (GPs) provide a flexible framework for probabilistic regression. The necessary computations involve standard matrix operations. There have been several attempts to accelerate these operations based on fast kernel matrix-vector multiplications. By focussing on the simplest GP computation, corresponding to test-time predictions in kernel ridge regression, we conclude that simple approximations based on clusterings in a kd-tree can never work well for simple regression problems. Analytical expansions can provide speedups, but current implementations are limited to the squared-exponential kernel and low-dimensional problems. We discuss future directions.
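For concreteness, the "simplest GP computation" referred to above is the predictive mean, which at test time reduces to a kernel matrix-vector multiply against precomputed weights; it is this multiply that fast-MVM schemes try to accelerate. A minimal NumPy sketch (all variable names are illustrative, not from the note):

    import numpy as np

    def sq_exp_kernel(A, B, lengthscale=1.0):
        # Squared-exponential kernel matrix between rows of A and B.
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / lengthscale**2)

    # Training stage: solve (K + noise*I) alpha = y once.
    X, y = np.random.randn(200, 3), np.random.randn(200)
    K = sq_exp_kernel(X, X) + 1e-2 * np.eye(len(X))
    alpha = np.linalg.solve(K, y)

    # Test-time prediction is a kernel matrix-vector multiply,
    # mean(x*) = k(x*, X) @ alpha, as in kernel ridge regression.
    X_star = np.random.randn(5, 3)
    mean = sq_exp_kernel(X_star, X) @ alpha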

[PDF, DjVu, GoogleViewer, BibTeX]

Presented at the Numerical Mathematics in Machine Learning workshop at the 26th International Conference on Machine Learning (ICML 2009), Montreal, Canada. The talk, available online, covered slightly different material.

Related work carried out after this note was reported in a JMLR paper. Actually, some of that work was done before this note, but getting negative results published is hard.

Since writing this note, some other promising approaches have appeared. The method of Ambikasaran et al. (2015) is available in George, and looks good for low-dimensional problems. Also Wang et al. (2019) seems to be a big step forwards, with demonstrations of some high-dimensional problems working in practice, and will be interesting to track.