by Sylvain Barde, University of Kent. Discussion paper KDPE 1908, June 2019.

**Non-technical summary:**

The paper develops and tests a multivariate extension of the Markov Information Criterion (MIC) originally developed in Barde (2017). The main motivation for the MIC is the problem of comparing the distance between a set of models and some empirical data in cases where estimating the models with traditional methods is not feasible. This is often the case for simulation models such as agent-based models. The MIC performs this measurement by mapping the simulated data to the Markov transition matrix of the underlying data generating process, and is proven to perform optimally (i.e. the measurement is unbiased in expectation) for all models reducible to a Markov process. As a result, not only can the MIC provide a measure of distance solely on the basis of simulated data, but it can do so for a very wide class of data generating processes. This is illustrated in Barde (2016), which performs a comparison exercise between three agent-based models (ABMs) of financial markets and a set of ARCH-like models in order to rank them in terms of empirical performance.
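The idea of scoring empirical data against transition probabilities learned only from simulated data can be illustrated with a much simpler stand-in for the MIC. The sketch below is an assumption-laden toy, not the paper's method: it uses a fixed first-order transition matrix estimated by smoothed counting rather than context-tree weighting, and the function names, the AR(1) test processes, the bin range, and the number of states are all hypothetical choices made for illustration.

```python
import numpy as np

def discretize(x, lo, hi, n_states):
    """Map real values in [lo, hi] to integer states 0..n_states-1 (clipped)."""
    scaled = (np.asarray(x, dtype=float) - lo) / (hi - lo)
    return np.clip((scaled * n_states).astype(int), 0, n_states - 1)

def transition_matrix(states, n_states):
    """First-order transition probabilities with add-one smoothing.

    Assumption: simple counting replaces the context-tree weighting
    algorithm that the MIC actually relies on.
    """
    counts = np.ones((n_states, n_states))
    np.add.at(counts, (states[:-1], states[1:]), 1)
    return counts / counts.sum(axis=1, keepdims=True)

def score_bits(P, states):
    """Average log-loss, in bits per transition, of `states` under P."""
    probs = P[states[:-1], states[1:]]
    return float(-np.mean(np.log2(probs)))

def simulate_ar1(phi, n, rng):
    """Hypothetical data generating process: x[t] = phi * x[t-1] + noise."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

rng = np.random.default_rng(0)
LO, HI, N_STATES = -8.0, 8.0, 16

# "Empirical" data come from a persistent process.
empirical = discretize(simulate_ar1(0.9, 5000, rng), LO, HI, N_STATES)

# Two candidate models, each known only through its simulated output.
sim_close = discretize(simulate_ar1(0.9, 20000, rng), LO, HI, N_STATES)
sim_far = discretize(simulate_ar1(0.0, 20000, rng), LO, HI, N_STATES)

P_close = transition_matrix(sim_close, N_STATES)
P_far = transition_matrix(sim_far, N_STATES)

# Lower score = simulated transitions describe the empirical data better.
score_close = score_bits(P_close, empirical)
score_far = score_bits(P_far, empirical)
print(score_close, score_far)
```

In this toy setting the candidate that shares the persistence of the empirical process should obtain the lower score, which is the kind of ranking the MIC is designed to produce without estimating either model.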

The main drawback of the MIC in its original form is that the measurement of the informational distance to the data can only be carried out for univariate models, such as the ABMs of financial series mentioned above. In principle, there is no conceptual problem with extending the MIC to multivariate models, as the state of a Markov process can be described by a vector of variables rather than a single variable. In practice, however, increasing the number of variables needed to describe the state of the system leads to a combinatorial explosion in the memory requirement of the context-tree weighting (CTW) algorithm of Willems et al. (1995), which forms the basis of the MIC. As a consequence, a naïve extension to multivariate measurements is not possible. Instead, the paper uses a combination of three strategies to overcome this curse of dimensionality and extend the MIC to multiple variables.

- The first uses the fact that the most significant bit (MSB) of a context observation is more informative than the least significant bit (LSB). We therefore start by permuting the bits of the context so that the MSBs are processed first and the LSBs are processed last.
- Following from this, the second strategy is to truncate the context in order to keep the memory requirement bounded at a tractable level, and to prune single-observation branches of the context tree, following the suggestion of Willems and Tjalkens (1997), in order to keep the tree as small as possible.
- Finally, because this truncation is expected to worsen the accuracy of the measurement, the final strategy is to take the average of multiple measurements in order to improve performance. Crucially, this can be done simply by changing the order in which the variables are conditioned on, and does not require additional simulated or empirical data.
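The bit permutation, truncation, and averaging steps described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the paper's implementation: the helper names (`to_bits`, `interleaved_context`, `average_over_orders`) are hypothetical, each variable is assumed to be already discretized to an integer of `n_bits` bits, the variable ordering is assumed to enter through the order of interleaving, and pruning of the context tree is not shown.

```python
from itertools import permutations

def to_bits(value, n_bits):
    """Binary digits of a non-negative integer, most significant bit first."""
    return [(value >> (n_bits - 1 - i)) & 1 for i in range(n_bits)]

def interleaved_context(obs, n_bits, order, max_len):
    """Build a context in which the MSBs of all variables come first.

    obs     : tuple of discretized observations, one integer per variable
    order   : ordering of the variables (assumed to drive the interleaving)
    max_len : truncation length that bounds the memory requirement
    """
    bits = [to_bits(obs[v], n_bits) for v in order]
    # Level 0 holds every variable's MSB, level n_bits-1 every LSB.
    context = [bits[j][level]
               for level in range(n_bits)
               for j in range(len(order))]
    # Truncation discards the least informative bits first.
    return context[:max_len]

def average_over_orders(score_fn, n_vars):
    """Average a measurement over all variable orderings.

    score_fn is a hypothetical callable mapping an ordering to a score;
    no additional data are needed, only a re-ordering of the variables.
    """
    orders = list(permutations(range(n_vars)))
    return sum(score_fn(order) for order in orders) / len(orders)

obs = (5, 2, 7)  # binary 101, 010, 111
full = interleaved_context(obs, n_bits=3, order=(0, 1, 2), max_len=9)
truncated = interleaved_context(obs, n_bits=3, order=(0, 1, 2), max_len=6)
reordered = interleaved_context(obs, n_bits=3, order=(2, 0, 1), max_len=6)
print(full)       # [1, 0, 1, 0, 1, 1, 1, 0, 1]
print(truncated)  # [1, 0, 1, 0, 1, 1]
print(reordered)  # [1, 1, 0, 1, 0, 1]
```

With the truncation length set to 6, the three least significant bits are dropped, and a different variable ordering yields a different truncated context from the same observation, which is what makes averaging over orderings informative.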

The extended methodology is validated by running two Monte Carlo model comparisons on VAR and DSGE models in order to evaluate the ability of the multivariate MIC to rank these data generating processes relative to traditional methods. These validations establish that the desirable properties of the univariate MIC can be preserved despite the large increase in the state space and the smaller amount of data. Finally, we carry out a proof-of-concept macroeconomic model comparison exercise to demonstrate that the MIC can enable the direct comparison of ABM and DSGE models, which is a crucial step towards increasing the policy relevance of ABMs.
