Indirect Inference

Last update: 21 Apr 2025 21:17
First version: 19 September 2010

A technique of parameter estimation for simulation models. You go and build a stochastic generative model of your favorite process or assemblage, and, being a careful scientist, you do a conscientious job of trying to include what you guess are all the most important mechanisms. The result is something you can step through to produce a simulation of the process of interest. But your model contains some unknown parameters, let's say generically $ \theta $, and you would like to tune those to match the data — or see if, despite your best efforts, there are aspects of the data which your model just can't match.

Very often, you will find that your model is too complicated for you to appeal to any of the usual estimation methods of statistics. Because you've been aiming for scientific adequacy rather than statistical tractability, it will often happen that there is no way to even calculate the likelihood of a given data set $ x_1, x_2, \ldots x_t \equiv x_1^t $ under parameters $ \theta $ in closed form, which would rule out even numerical likelihood maximization, to say nothing of Bayesian methods, should you be into them. (For concreteness, I am writing as though the data were just a time series, possibly vector-valued, but the ideas adapt in the obvious way to spatial processes or more complicated formats.) Yet you can simulate; it seems like there should be some way of saying whether the simulations look like the data.

This is where indirect inference comes in, with what I think is a really brilliant idea. Introduce a new model, called the "auxiliary model", which is mis-specified and typically not even generative, but is easily fit to the data, and to the data alone. (By that last I mean that you don't have to impute values for latent variables, etc., etc., even though you might know those variables exist and are causally important.) The auxiliary model has its own parameter vector $ \beta $, with an estimator $ \hat{\beta} $. These parameters describe aspects of the distribution of observables, and the idea of indirect inference is that we can estimate the generative parameters $ \theta $ by trying to match those aspects of observations, by trying to match the auxiliary parameters.

On the one side, start with the data $ x_1^t $ and get auxiliary parameter estimates $ \hat{\beta}(x_1^t) \equiv \hat{\beta}_t $. On the other side, for each $ \theta $ we can generate a simulated realization $ \tilde{X}_1^t(\theta) $ of the same size (and shape, if applicable) as the data, leading to auxiliary estimates $ \hat{\beta}(\tilde{X}_1^t(\theta)) \equiv \tilde{\beta}_t(\theta) $. The indirect inference estimate $ \hat{\theta} $ is the value of $ \theta $ where $ \tilde{\beta}_t(\theta) $ comes closest to $ \hat{\beta}_t $. More generally, we can introduce a (symmetric, positive-definite) matrix $ \mathbf{W} $ and minimize the quadratic form \[ \left(\hat{\beta}_t - \tilde{\beta}_t(\theta)\right) \cdot \mathbf{W} \left(\hat{\beta}_t - \tilde{\beta}_t(\theta)\right) \] with the entries in the matrix chosen to give more or less relative weight to the different auxiliary parameters.

The remarkable thing about this is that it works, in the sense of giving consistent parameter estimates, under not too strong conditions. Suppose that the data really are generated under some parameter value $ \theta_0 $; we'd like to see $ \hat{\theta} \rightarrow \theta_0 $. (Estimating the pseudo-truth in a mis-specified model works similarly but is more complicated than I feel like going into right now.) Sufficient conditions for this are that

the auxiliary estimates converge to a non-random "binding function" \[ \tilde{\beta}_t(\theta) \rightarrow b(\theta) \] uniformly in $ \theta $, and
the binding function $ b(\theta) $ is invertible.

(Really, both properties just need to hold in some suitable domain $ \Theta $ which includes $ \theta_0 $.)

Basically, these mean that the set of auxiliary parameters have to be rich enough to characterize or distinguish the different values of the generative parameters, and we need to be able to consistently estimate the former. This means we need at least as many auxiliary parameters as generative ones, so auxiliary models tend to be ones where it's easy to keep loading on parameters. (Adding too many auxiliary parameters does lead to loss of efficiency, however.) If $ b(\theta) $ is also differentiable in $ \theta $, and some additional regularity conditions hold, then we even get asymptotic Gaussian errors, with the matrix of partial derivatives $ \partial \beta_i/\partial \theta_j $ playing a role like the Fisher information matrix. — I can't resist adding that the usual conditions quoted for the consistency of indirect inference are stronger, and that these come from a chapter in the dissertation of my student Linqiao Zhao.

I think this is a really, really powerful idea, and one which should be much more widely adopted by people working with simulation models. In particular, one of my Cunning Plans is to make it work for agent-based modeling, and especially for models of social network formation.

A topic of particular interest to me is how to use non-parametric estimators, of regression or density curves say, as the auxiliary models, since then there is never any problem of having too few auxiliary parameters (though they might still be insensitive to the generative parameters, if one is looking the wrong curves). Nickl and Pötscher, below, have some initial results in this direction.

("Approximate Bayesian computation" is a very similar idea, but where ~~the plain truth of the evidence is corrupted by prejudice~~ a prior distribution is used to stabilize estimates, at some cost in sensitivity.)

--- Other techniques of simulation-based inference now get their own notebook.

(I wrote the first version of this sometime before 19 September 2010...)

Christian Gouriéroux and Alain Monfort, Simulation-Based Econometric Methods [Review: By Indirection Find Direction Out]
Christian Gouriéroux, Alain Monfort and E. Renault, "Indirect Inference", Journal of Applied Econometrics 8 (1993): S85--S118 [JSTOR]
A. A. Smith, "Indirect Inference", in The New Palgrave Dictionary of Economics 2nd edition [PDF preprint]

Ernesto Carrella, Richard M. Bailey and Jens Koed Madsen, "Indirect inference through prediction", Journal of Artificial Societies and Social Simulation 23:1 (2020): 7, arxiv:1807.01579 [The idea here is to start with a big candidate pool of auxiliary statistics, sample $\theta$ values randomly, calculate the statistics on simulations from each $\theta$, and then use a penalized linear regression to learn to predict $\theta$ as a function of the $\hat{\beta}$. (The point of the penalization is to stabilize the learned function, and ideally to do some variable selection.) I can well believe that this is computationally nicer than optimization, but it's (implicitly) learning a linear approximation to the inverse binding function, i.e., to $b^{-1}(\beta)$. Unless $b^{-1}$ really is linear, though, the best linear approximation learned by regression will change with the distribution of $\theta$. Maybe this could be iterated, to get a succession of linear approximations over successively smaller domains, hopefully shrinking towards the ideal estimate.]
Giovanni Luca Ciampaglia, "A framework for the calibration of social simulation models", Advances in Complex Systems accepted, arxiv:1305.3842 [At last, somebody's doing this!!!]
Jean-Jacques Forneron, Serena Ng, "The ABC of Simulation Estimation with Auxiliary Statistics", arxiv:1501.01265
David T. Frazier and Bonsoo Koo, "Indirect inference for locally stationary models", Journal of Econometrics 223 (2021): 1--27 [Comments/queries]
Florian Gach, Benedikt M. Pötscher, "Non-Parametric Maximum Likelihood Density Estimation and Simulation-Based Minimum Distance Estimators", arxiv:1012.3851 [See the comments on the earlier paper by Nickl and Pötscher below.]
Bruce E. Kendall, Stephen P. Ellner, Edward Mccauley, Simon N. Wood, Cheryl J. Briggs, William W. Murdoch and Peter Turchin "Population Cycles in the Pine Looper Moth: Dynamical Tests of Mechanistic Hypotheses", Ecological Monographs 75 (2005): 259--276 [PDF reprint. I learned about indirect inference by hearing Prof. Ellner talk about this paper at the 2007 Montreal workshop on statistics for dynamical systems.]
Richard Nickl, Benedikt M. Pötscher, "Efficient Simulation-Based Minimum Distance Estimation and Indirect Inference", arxiv:0908.0433 [Proving that by using a particular nonparametric density estimator, based on the method of sieves, as the auxiliary estimator, the indirect-inference estimator becomes as efficient, asymptotically, as maximum likelihood. This is impressive, but they have to assume the process being simulated gives IID data, and generalizing would not be easy (at least not for me).
Simon N. Wood, "Statistical inference for noisy nonlinear ecological dynamic systems", Nature 466 (1102--1104)

CRS, "A Note on Simulation-Based Inference by Matching Random Features", arxiv:2111.09220
My lecture notes on indirect inference (most recent version)

Linqiao Zhao, A Model of Limit-Order Book Dynamics and a Consistent Estimation Procedure, Ph.D. thesis, Statistics Department, Carnegie Mellon University, 2010 [PDF]

Marianne Bruins, James A. Duffy, Michael P. Keane, Anthony A. Smith Jr, "Generalized Indirect Inference for Discrete Choice Models", arxiv:1507.06115
Ernesto Carrella, "No Free Lunch when Estimating Simulation Parameters", Journal of Artificial Societies and Social Simulation 24:2 (2021): 7
D. R. Cox and Christiana Kartsonaki, "The fitting of complex parametric models", Biometrika 99 (2012): 741--747
Veronika Czellar and Elvezio Ronchetti, "Accurate and robust tests for indirect inference", Biometrika 97 (2010): 621--630
Christopher C. Drovandi, Anthony N. Pettitt, Malcolm J. Faddy, "Approximate Bayesian computation using indirect inference", Journal of the Royal Statistical Society C 60 (2011): 317--337
Christopher C. Drovandi, Anthony N. Pettitt, and Anthony Lee, "Bayesian Indirect Inference Using a Parametric Auxiliary Model", Statistical Science 30 (2015): 72--95
Mark Girolami, Anne-Marie Lyne, Heiko Strathmann, Daniel Simpson, Yves Atchade, "Playing Russian Roulette with Intractable Likelihoods", arxiv:1306.4032