Doctoral thesis defence - Erik Spånberg

Doctoral thesis defence

Date: Friday 25 November 2022

Time: 13.00 – 15.30

Location: Lecture Hall 4, House 2, Albano, Albanovägen 18

Variational Inference of Dynamic Factor Models

 

Abstract

When we make difficult and crucial decisions, forecasts are powerful and important tools, and statistical models can be our most effective aid in producing them. Ideally, these models incorporate large sets of multifaceted data. However, time and computational power may limit our ability to use such tools effectively, not least in rapidly changing situations that require frequent updates.

This thesis takes a popular model for big data analysis, the dynamic factor model (DFM), and makes it more viable for fast-paced analytics in practice. The DFM assumes that the data are driven by latent (unobserved) dynamic factors. By estimating these factors, we may gain deeper insight into our data and, ultimately, predictive power. An appealing method for DFM estimation is Bayesian inference, which produces probability distributions (called posterior distributions) of factors and parameters.
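To make the model structure concrete, the sketch below simulates data from a small DFM. It is an illustration only, not the specification used in the thesis: the function name `simulate_dfm`, the first-order autoregressive factor dynamics, the Gaussian noise and all default values are assumptions chosen for brevity.

```python
import numpy as np

def simulate_dfm(n_series=10, n_factors=2, n_obs=200, phi=0.8, noise_sd=0.5, seed=0):
    """Simulate x_t = Lambda f_t + e_t with factors f_t = phi * f_{t-1} + u_t.

    All names and defaults are hypothetical; this is a textbook-style DFM,
    not necessarily the exact model of the thesis.
    """
    rng = np.random.default_rng(seed)
    # Loadings connect each observed series to each latent factor.
    loadings = rng.normal(size=(n_series, n_factors))
    # Latent factors follow a simple AR(1) process, started at zero.
    factors = np.zeros((n_obs, n_factors))
    for t in range(1, n_obs):
        factors[t] = phi * factors[t - 1] + rng.normal(size=n_factors)
    # Observed data: common component plus idiosyncratic noise.
    data = factors @ loadings.T + noise_sd * rng.normal(size=(n_obs, n_series))
    return data, factors, loadings

data, factors, loadings = simulate_dfm()
```

With the noise switched off (`noise_sd=0.0`), the ten simulated series span only a two-dimensional space, which is the sense in which a few factors "drive" many series.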

We develop variational inference, which approximates Bayesian inference, in order to estimate DFMs very quickly. With this method, DFMs can be estimated in a fraction of the computational time required by standard approaches; estimations that take hours are reduced to seconds or a few minutes.
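In general terms, variational inference replaces the exact posterior with the closest member of a tractable family of distributions. The identities below are the standard ones; the notation ($y$ for the data, $F$ for the factors, $\theta$ for the parameters) and the factorised form of $q$ are assumptions made here for illustration and are not taken from the thesis.

```latex
\begin{align}
\log p(y) &= \mathcal{L}(q)
  + \mathrm{KL}\bigl(q(F,\theta)\,\|\,p(F,\theta \mid y)\bigr), \\
\mathcal{L}(q) &= \mathbb{E}_{q}\bigl[\log p(y, F, \theta)\bigr]
  - \mathbb{E}_{q}\bigl[\log q(F,\theta)\bigr], \\
q(F,\theta) &= q(F)\,q(\theta)
  \quad \text{(assumed mean-field factorisation)}, \\
\log q^{*}(F) &= \mathbb{E}_{q(\theta)}\bigl[\log p(y, F, \theta)\bigr]
  + \text{const}.
\end{align}
```

Because $\log p(y)$ does not depend on $q$, maximising the lower bound $\mathcal{L}(q)$ is equivalent to minimising the Kullback-Leibler divergence to the exact posterior. The last line is the usual coordinate update, which is applied in turn to each block of $q$ until the bound stops increasing.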

Additionally, we allow for arbitrary patterns of missing data. We can consider data of various sizes and shapes, including different sample sizes, unsynchronized publications and mixed frequencies.
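One common way to accommodate arbitrary missing-data patterns in state-space models is to let each measurement update use only the entries observed at that time point. The sketch below shows such an update for Gaussian factors. It illustrates the general idea under assumed notation; the function name `update_with_missing` and the use of `NaN` to mark missing values are hypothetical, and the thesis may handle missing data differently.

```python
import numpy as np

def update_with_missing(prior_mean, prior_cov, y, loadings, noise_var):
    """One Kalman-style measurement update using only the observed entries of y.

    Missing observations are encoded as NaN (an assumption of this sketch).
    """
    observed = ~np.isnan(y)
    if not observed.any():
        # Nothing published for this period: the prior is carried forward.
        return prior_mean.copy(), prior_cov.copy()
    H = loadings[observed]                # rows of the loading matrix we can use
    R = np.diag(noise_var[observed])      # idiosyncratic noise of observed series
    S = H @ prior_cov @ H.T + R           # innovation covariance
    K = np.linalg.solve(S, H @ prior_cov).T   # gain K = P H' S^{-1}
    post_mean = prior_mean + K @ (y[observed] - H @ prior_mean)
    post_cov = prior_cov - K @ H @ prior_cov
    return post_mean, post_cov

# Three series, two factors; only the first series is observed this period.
loadings = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
mean, cov = update_with_missing(
    np.zeros(2), np.eye(2), np.array([2.0, np.nan, np.nan]), loadings, np.ones(3)
)
```

In this setup a quarterly series in a monthly model would simply be `NaN` in two months out of three, and a series that has not yet been published would be `NaN` at the end of the sample.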

In the first paper, we develop an “Expectation Maximization” algorithm to find the maximum (the mode) of the posterior distribution of the DFM parameters. This can be seen as a reduced case of variational inference. A simulation study shows that the method is preferable to the more common maximum likelihood approach.
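The generic form of an Expectation Maximization iteration for a posterior mode can be written as follows. The notation ($y$ for the data, $F$ for the factors, $\theta$ for the parameters, $k$ for the iteration index) is assumed here for illustration; the specific updates derived in the paper are not reproduced.

```latex
\begin{align}
\text{E-step:}\quad
Q\bigl(\theta \mid \theta^{(k)}\bigr)
  &= \mathbb{E}_{F \mid y,\, \theta^{(k)}}\bigl[\log p(y, F \mid \theta)\bigr], \\
\text{M-step:}\quad
\theta^{(k+1)}
  &= \arg\max_{\theta}\,
     \Bigl\{ Q\bigl(\theta \mid \theta^{(k)}\bigr) + \log p(\theta) \Bigr\}.
\end{align}
```

The only difference from maximum likelihood is the log-prior term $\log p(\theta)$ in the M-step. The link to variational inference is that this iteration corresponds to restricting the approximating distribution of $\theta$ to a point mass, while the distribution of $F$ is left unrestricted.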

In the second paper, we develop variational inference for standard DFMs. Empirical examples show that this method approximates the posterior distributions of factors and parameters quickly and accurately. Predictive distributions, both in and out of sample, are practically indistinguishable from those obtained with much slower simulation techniques.

In the third paper, we extend the approach to explicitly separate the relevant parts of the data from the irrelevant parts for each individual factor. We employ so-called “spike-and-slab” priors, such that individual connections between factors and data can be switched on or off. These “switches” become part of the estimation problem. Simulation studies show that the method identifies the connections well.
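The "switch" idea can be illustrated by drawing factor loadings from a spike-and-slab prior. The sketch below assumes one common variant, a point mass at zero (the spike) mixed with a Gaussian (the slab); the function name, the inclusion probability and the slab scale are hypothetical and need not match the prior used in the paper.

```python
import numpy as np

def sample_spike_and_slab(n_series, n_factors, inclusion_prob=0.3, slab_sd=1.0, seed=0):
    """Draw a loading matrix where each entry is either exactly zero or Gaussian.

    switches[i, k] is True when series i is connected to factor k.
    """
    rng = np.random.default_rng(seed)
    # Bernoulli switches: on with probability inclusion_prob.
    switches = rng.random((n_series, n_factors)) < inclusion_prob
    # Slab draws are used only where the switch is on; the spike sets the rest to zero.
    slab = rng.normal(scale=slab_sd, size=(n_series, n_factors))
    loadings = np.where(switches, slab, 0.0)
    return loadings, switches

loadings, switches = sample_spike_and_slab(n_series=8, n_factors=3)
```

In estimation the direction is reversed: the switches are unknown, and their posterior inclusion probabilities indicate which series are relevant for which factor.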

The fourth paper is an application. We construct a very large DFM to predict the Swedish macroeconomy, including 250 quarterly and 500 monthly time series, with different sample sizes and publication dates. To our knowledge, this is the largest prediction model of the Swedish macroeconomy to date. We introduce a non-dogmatic structural framework, in which we direct the analysis towards certain features without strictly imposing them. This produces factor estimates that are interpretable in terms of consumption, production, prices, financial markets and more. Each time series and forecast can be decomposed according to these interpretations. A forecast evaluation shows good predictive precision for blocks of time series, as well as for some key series individually. Thanks to the speed gained from variational inference, the model can be updated quickly, which makes it operable in practice.
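The decomposition mentioned above follows from the linear structure of a factor model: the common component of a series is a sum of one contribution per factor. The sketch below uses made-up numbers and two hypothetical factor labels; the function name `decompose_series` is an assumption, and the numbers are not results from the thesis.

```python
import numpy as np

def decompose_series(loadings_i, factors, factor_names):
    """Split the common component of one series into per-factor contributions."""
    # Broadcasting multiplies column k of the factors by the loading on factor k.
    contributions = factors * loadings_i
    return {name: contributions[:, k] for k, name in enumerate(factor_names)}

# Illustrative values only: three time points, two factors.
factors = np.array([[1.0, 0.5], [2.0, -1.0], [0.0, 3.0]])
loadings_i = np.array([0.4, -0.2])
parts = decompose_series(loadings_i, factors, ["consumption", "prices"])

# The contributions add up to the common component of the series.
common = sum(parts.values())
```

The same decomposition applies to a forecast, since a forecast of the series is built from forecasts of the factors through the same loadings.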