In practical applications of Micro-Simulation Models (MSMs), very little is usually known about the properties of the simulated values. This paper argues that we need to apply the same rigorous standards for inference in micro-simulation work as in scientific work generally. If not, MSMs will lose credibility. Differences between inference in static and dynamic models are noted, and then the paper focuses on the estimation of behavioral parameters. There are four themes: calibration viewed as estimation subject to external constraints, piecewise versus system-wide estimation, simulation-based estimation, and validation. © 2002 IMACS. Published by Elsevier Science
Micro-simulation has been used, for instance, in economics, economic geography and sociology to assess the impact of changes in economic and social policy on individuals, households and firms. These models usually include, in a rather detailed way, legislative rules as they apply at the micro level, such as those for taxes and benefits, and sometimes also behavioral models to capture individual adjustments to legislative changes. They are applied to a sample of individuals (households, firms), which provides the starting values for a simulation. As the simulation proceeds, the status of each individual is updated: for instance, whether the individual is alive, in work or eligible for benefits, what incomes are received and what taxes are paid. The object of simulation could, for instance, be to estimate the change in government revenues following a tax change. The reason not to use a more aggregate approach is then that the highly nonlinear tax system interacts with the income distribution and with the composition of the population of taxpayers in terms of their eligibility for deductions and the applicability of thresholds and tax rates, so that a micro analysis is needed. Another object of simulation could be to analyze the distribution of incomes after tax. For instance, does a reduction in the tax rates at low incomes reduce the population share in poverty, taking any consequential reductions in benefits and changes in work incentives into account?
One usually distinguishes between static models, in which the size and composition of the simulated population are not changed, and dynamic models, which include mechanisms that simulate changes in the population of individuals.
Inference in micro-simulation models (MSMs) is in principle no different from statistical inference generally, but in current practice the inference aspects have been neglected. Unknown parameters have not always been estimated using estimation methods with known properties; instead, values have been assigned by some calibration method. One has been satisfied if the model runs and approximately tracks observed data. The large size of a typical MSM and the difficulty of obtaining coherent data have made many researchers and practitioners accept ad hoc methods.
There are thus practical problems with inference in MSMs, related to the large number of relations and conditions, the frequent use of nonstandard functional forms, often including discontinuities, and the fact that data are typically obtained from many different sources.
Micro-simulation aims at statements about the distribution of some endogenous variables (for instance, the distribution of incomes) defined on a population (for instance, the population of Swedes in a particular year), given certain policy assumptions (for instance, assumptions about tax rates) and initial conditions. These initial conditions are usually given by a sample of individuals on which the MSM operates. In the simulation sample values are changed or updated, and the new sample values are used to estimate properties of the distribution of interest (for instance, a total, a mean or a Gini coefficient).
A proper inference usually involves several random experiments. One is the drawing of the sample of initial conditions, another is the random experiment or process assumed to generate population data, and a third is the generation of random numbers in the simulation experiment. The choice of methods is also determined by the mode of inference, that is, whether the inference is to a finite population or to a "super population".
Because the random experiments involved and the mode of inference in static micro-simulation are, in general, different from those of dynamic micro-simulation, it is useful first to discuss inference in static models and then turn to dynamic models. Then follows a section on the estimation of behavioral models. Although that section is primarily based on the "super population" thinking of dynamic models, much of what is said about incorporating external information and simulation estimation also applies to static models. The paper ends with a few concluding remarks.
The simplest case of a static model is one without behavioral response relations. It only includes a set of deterministic rules, for instance, tax and benefit rules translated into computer code. The FASIT model of Statistics Sweden and the LAW model of Statistics Denmark are two examples. Work is also currently under way on EUROMOD, a European model for all EU countries. Given a sample of pretax incomes, such a model computes taxes, benefits and disposable incomes for each individual in the sample. In this case, there is no model-based inference but only an inference from the sample of initial conditions (pretax incomes) to the population from which this sample was drawn. An inference to the finite population is then meaningful and usually also desired.
If the sample of initial conditions is a probability sample, this would seem to be a standard application of sampling theory. Usually, however, the sample was drawn from a population dated a few years back, while an inference is desired to the population as it is today, and in general these two populations differ.
This problem is usually handled by reweighting. The sample weights are adjusted such that a standard inference will reproduce the observed distribution of certain variables in the present population. One might, for instance, know that the age distribution and the distribution of schooling have changed and then seek to adjust the sampling weights accordingly. A technical approach to achieve this is calibration; see Lindström (1997), Lundström (1997) and Merz (1993). The idea is to obtain new weights that are as close to the old ones as possible, but that make the simulated values aggregate to known totals. Closeness is defined by some measure of distance. The choice of distance measure is rather arbitrary, but it has been shown that certain distance functions give estimators that are well known in the sampling literature (Deville and Särndal, 1992; Lundström, 1997). Although calibration estimators aggregate to known totals, this is no guarantee that one obtains an inference to the desired population. The problem might remain if the known totals do not include the key variables of interest, or if not only the location but also the dispersion of key variables has changed. Without a thorough analysis of the causes of population changes, any reweighting becomes ad hoc. Formulating a model that captures causal relations, on the other hand, leads to a dynamic MSM.
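The reweighting idea can be sketched for one particular distance measure. The example below assumes the chi-square distance, the sum of (w_i - d_i)^2 / d_i over the sample, for which the calibrated weights have a closed form. The function name, the three age groups and the totals are hypothetical and serve only to show the mechanics; other distance functions would give other weights.

```python
import numpy as np

def calibrate_weights(d, X, totals):
    """Linear (chi-square distance) calibration.

    d      : (n,) original sampling weights
    X      : (n, k) auxiliary variables observed in the sample
    totals : (k,) known population totals of the auxiliary variables
    Returns weights w minimizing sum((w - d)**2 / d) subject to X.T @ w == totals.
    """
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    totals = np.asarray(totals, dtype=float)
    M = X.T @ (d[:, None] * X)                   # sum_i d_i x_i x_i'
    lam = np.linalg.solve(M, totals - X.T @ d)   # Lagrange multipliers
    return d * (1.0 + X @ lam)

rng = np.random.default_rng(0)
n = 500
age_group = rng.integers(0, 3, size=n)             # hypothetical three age groups
X = np.eye(3)[age_group]                           # one indicator column per group
d = np.full(n, 20.0)                               # old weights (sum to 10,000)
known_totals = np.array([3000.0, 4000.0, 3000.0])  # hypothetical present-day totals

w = calibrate_weights(d, X, known_totals)
print("calibrated group totals:", X.T @ w)
```

With indicator variables only, this reduces to post-stratification: every individual in a group gets the group total divided by the group's sample count. As the text notes, reproducing the known totals says nothing about variables that are not among the calibration constraints.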
Static MSMs can also include behavioral relations, for instance, labor supply as a function of the budget set (incomes, taxes and benefits). A static model (in the usual economic sense) has no time dimension, but in practice a micro-simulation analyst wants to say something about a population in real time. It follows from the tax and benefit rules that a static tax-benefit model without behavioral relations gives the immediate, first-order effects of tax and benefit changes, but if behavioral relations are included, there is an issue about their interpretation. Does a labor supply relation, for instance, give the behavioral response that materializes within a year, or does it give the total accumulated effect until some steady state is reached? Most economists probably think of static models in the latter sense, but this raises new issues. To test and estimate such a model one needs a sample of individuals who have all reached a steady state. Is the adjustment process so quick that a random cross-section of individuals is suitable for inference?
There is no general and simple answer to this question, but let us assume that the adjustment process is almost immediate. An inference would then have to account both for the random uncertainty that arises because the model is simulated on a sample of initial conditions, and for the uncertainty generated by the estimated behavioral model. The latter includes two components. The first arises because the unknown parameters are estimated. The properties of these estimates depend on the properties of the model, on how the data were obtained and on what estimation method was used. The second component arises because the estimated model is simulated by invoking a random number generator. The properties of simulations from a static model, and how to estimate the various variance components, are discussed in some detail in Klevmarken (1998). It is a problem that we do not know, and cannot simulate, the values of the exogenous variables of the observations not included in the sample. An inference has to be conditioned on the observed exogenous variables in the sample. If estimators can be written as sums of individual contributions, then Horvitz-Thompson estimators are consistent, but for more general parameters there might not be any finite sample estimator (see Pudney and Sutherland, 1996).
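To illustrate the case where the target is a sum of individual contributions, the following sketch computes a Horvitz-Thompson estimate of a population total. The lognormal income population, the two inclusion probabilities and the Poisson sampling design are all assumptions made for this example; they are not taken from any of the models discussed.

```python
import numpy as np

def horvitz_thompson_total(y, pi):
    """Horvitz-Thompson estimate of a population total from sampled values y
    and their inclusion probabilities pi."""
    y = np.asarray(y, dtype=float)
    pi = np.asarray(pi, dtype=float)
    return float(np.sum(y / pi))

rng = np.random.default_rng(1)
N = 20_000
income = rng.lognormal(mean=10.0, sigma=0.5, size=N)   # hypothetical population
# Hypothetical design: high incomes are sampled five times as often as low ones.
pi = np.where(income > np.median(income), 0.10, 0.02)
in_sample = rng.random(N) < pi                          # Poisson sampling

true_total = income.sum()
ht_total = horvitz_thompson_total(income[in_sample], pi[in_sample])
naive_total = N * income[in_sample].mean()              # ignores the design
print(f"true {true_total:.3e}  HT {ht_total:.3e}  naive {naive_total:.3e}")
```

The unweighted estimate overstates the total because high incomes are over-represented in the sample, while the weighted sum corrects for the unequal inclusion probabilities.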
In a dynamic MSM, there is no constant population to which an inference can be drawn, because the model defines how the population changes both in size and in composition. Only an inference to the super-population defined by the model would seem meaningful. Let us write the model in the following way:
$$y_t = g(y_0, \varepsilon_1, \ldots, \varepsilon_t; \theta) \qquad (1)$$
where $y_0$ is a vector of initial conditions, in practice set by the sample on which the simulations are done, $\varepsilon_s$ are random errors with p.d.f. f, and $\theta$ is a vector of parameters.
Suppose we are interested in estimating
$$E[\mu(y_t)] \qquad (2)$$
where it is assumed that $\varepsilon_s$ and $y_0$ are independent for all $s \le t$ and that f is known. $\mu(y_t)$ is a statistic of interest. The distribution of $y_0$ will, in practice, be estimated by the corresponding empirical distribution function obtained through the sample of initial conditions. One could either condition on the sample of initial conditions (if t is large and the model has some ergodic properties, the influence of $y_0$ becomes small) or use the bootstrap technique to evaluate the random influence from the choice of initial conditions sample.
In general, it will not be possible to evaluate expression (2) analytically, but by replicated drawings from the distribution of $\varepsilon$ a number of replications of $\mu$ are obtained, and the mean of these $\mu$-values is an unbiased estimate of expression (2). This procedure assumes that the parameters $\theta$ are known. In practice, they are not and have to be replaced by some estimates $\hat{\theta}$. This implies that the simulated estimate of (2) is a random function of $\hat{\theta}$. If $\mu$ and g satisfy certain regularity conditions¹ and if $\hat{\theta}$ is a consistent estimate of $\theta$, then the estimate of $E(\mu)$ is consistent too. But even if $\hat{\theta}$ is unbiased, the estimate of $E(\mu)$ is, in general, not unbiased, because $\mu$ and g are nonlinear functions. By replicating also over the domain of $\hat{\theta}$ we might thus like to estimate $E_{\hat{\theta}}[E(\mu \mid \hat{\theta})]$. These replicated simulations will also give an estimate of the corresponding variance.
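The replication scheme described above can be sketched with a toy dynamic model. Everything specific here is an assumption for illustration: a first-order autoregressive process with standard normal errors stands in for g, the statistic is the share of individuals above a threshold, and the uncertainty in the parameter estimate is represented by a normal distribution with a made-up standard error.

```python
import numpy as np

def simulate_mu(y0, theta, T, rng):
    """One replication: iterate y_t = theta * y_{t-1} + eps_t for T periods
    and return mu(y_T), here the share of individuals with y_T > 1."""
    y = y0.copy()
    for _ in range(T):
        y = theta * y + rng.standard_normal(y.shape)
    return float(np.mean(y > 1.0))

def replicate(y0, theta_hat, se_theta, T, R, rng, draw_theta):
    """Mean and variance of mu over R replications. If draw_theta is True,
    a new parameter value is drawn from N(theta_hat, se_theta**2) each time."""
    values = np.empty(R)
    for r in range(R):
        theta = rng.normal(theta_hat, se_theta) if draw_theta else theta_hat
        values[r] = simulate_mu(y0, theta, T, rng)
    return values.mean(), values.var(ddof=1)

rng = np.random.default_rng(2)
y0 = rng.standard_normal(1000)   # sample of initial conditions (held fixed)

mean_fixed, var_fixed = replicate(y0, 0.8, 0.05, T=10, R=200, rng=rng, draw_theta=False)
mean_drawn, var_drawn = replicate(y0, 0.8, 0.05, T=10, R=200, rng=rng, draw_theta=True)
print("parameters held fixed :", mean_fixed, var_fixed)
print("parameters also drawn :", mean_drawn, var_drawn)
```

The first call captures only the simulation error at a fixed parameter value. The second call also replicates over the parameter, so its variance additionally reflects estimation uncertainty. Both condition on the sample of initial conditions.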
Our limited capacity as model builders, the difficulty of obtaining good comprehensive data from which the model parameters can be estimated, and the piecewise approach usually adopted in practice, estimating the model sub-model by sub-model, all contribute to deviations of simulated values and distributions from observed data. To make the model "stay on track", some model builders have aligned their models to external benchmark data. Population totals and means from official statistics, or estimates from surveys not used to estimate the model, are sometimes used as benchmarks. For a model to gain credibility with users, they often require that it is able to reproduce the basic demographic structure of the population and predict well-known benchmarks such as the labor force participation rate, the unemployment rate, and the mean and dispersion of disposable income. For this reason, model builders have forced their models to predict these numbers without error. In the US model CORSIM, for instance, this alignment is done by adjusting the simulated values (and not the parameter estimates); see Caldwell (1988, 1993, 1996).
Alignment is usually done by simple proportional adjustments, but there are also more sophisticated procedures. The ADJUST procedure developed by Merz (1993, 1994), originally designed for reweighting in static models (cf. above), might also be used for alignment. However, in this context the whole approach appears even more ad hoc than when it is used for reweighting. It does not consider the stochastic properties of the model at all.
A natural way to incorporate this kind of externally given information is to look upon the estimation problem as one of constrained estimation. For linear models, this approach is discussed in many textbooks of statistics and econometrics. Assume the following simple model:
where Yt and Zt are n x 1 and n x k matrices. Also, assume a total is known for period . This information can be used as a constraint on the least-squares estimates of . The constraint then becomes
J is a unit vector, and N the size of the population for which the total applies. When the constrained least-squares estimator is used for prediction, the predictions can be written on the following form:
They are BLUP within the class that satisfies the constraint. Simulated values are obtained by adding random errors with mean zero drawn from an appropriate probability distribution. The matrix in curly brackets is a matrix of alignment factors. The simulated Y-values will give a total for the period equal to the known total except for a small simulation error with mean zero, which is N times the mean of the simulation errors. The following conclusions can be drawn. Alignment should, in general, not be done with simple proportional alignment factors; instead, each individual gets its own alignment factor. Also, in a model with more than one endogenous variable, a constraint that applies to one variable will, in general, imply an alignment not only of that particular variable but also of all other variables. Furthermore, in nonlinear models alignment factors as simple as those of the linear case will, in general, not exist.
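The point that a constraint leads to individual rather than proportional alignment factors can be illustrated numerically. The sketch below uses the textbook restricted least-squares formula and assumes a constraint of the form (N/n) times the sum of predicted values in the target period equals a known total. This particular constraint, the data-generating values and all variable names are assumptions made for the example.

```python
import numpy as np

def constrained_ls(Z, y, R, r):
    """OLS and restricted least squares subject to the linear constraint R @ beta = r."""
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    beta_ols = ZtZ_inv @ Z.T @ y
    gain = ZtZ_inv @ R.T @ np.linalg.inv(R @ ZtZ_inv @ R.T)
    beta_con = beta_ols + gain @ (r - R @ beta_ols)
    return beta_ols, beta_con

rng = np.random.default_rng(3)
n, N = 200, 10_000                      # sample size and population size
x = rng.uniform(1.0, 5.0, size=n)
Z = np.column_stack([np.ones(n), x])    # estimation data
y = Z @ np.array([2.0, 3.0]) + rng.standard_normal(n)

# Regressors in the period for which a total is known (hypothetical shift in x).
Z_T = np.column_stack([np.ones(n), x + 0.5])
known_total = 1.03 * (N / n) * np.sum(Z_T @ np.array([2.0, 3.0]))
R = (N / n) * Z_T.sum(axis=0, keepdims=True)   # 1 x k constraint matrix
r = np.array([known_total])

beta_ols, beta_con = constrained_ls(Z, y, R, r)
pred_ols = Z_T @ beta_ols
pred_con = Z_T @ beta_con
alignment = pred_con / pred_ols         # implied individual alignment factors
print("constrained total:", (N / n) * pred_con.sum(), "known total:", known_total)
print("alignment factors range from", alignment.min(), "to", alignment.max())
```

The constrained predictions reproduce the known total, and the ratio of constrained to unconstrained predictions differs between individuals, so no single proportional factor would give the same result.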
One might also note that it is possible to have an alignment, which simulates the known total exactly. Practitioners seem to prefer this kind of alignment. The alignment matrix would then become a function of the randomly drawn simulation errors. Let the simulation error be then the new alignment matrix obtains if Yt is replaced by in the least-squares estimate of , and by . One would thus have to re-estimate the parameters for every new simulation. However, given the random specification of the model it is not clear why one would like to use the external information in this way. The model is not set up such that it should replicate the same total every time.
If the external data are estimates rather than population parameters, that is a reason not to enforce an exact equality even in the mean. A natural approach to incorporating uncertain external information is mixed estimation, a technique that is well developed for linear models in many textbooks, but less developed for nonlinear models.
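For the linear case, mixed estimation can be sketched as follows. The example uses the textbook Theil-Goldberger form, in which the external information r = R beta + v with Var(v) = V is stacked with the sample data. The external value 11.5, its variances and the simulated data are hypothetical, and the error variance is replaced by its OLS estimate.

```python
import numpy as np

def mixed_estimator(Z, y, R, r, V):
    """Mixed estimator combining sample data y = Z beta + e with uncertain
    external information r = R beta + v, Var(v) = V. Returns (OLS, mixed)."""
    beta_ols, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta_ols
    sigma2 = resid @ resid / (len(y) - Z.shape[1])   # estimated error variance
    V_inv = np.linalg.inv(V)
    lhs = Z.T @ Z / sigma2 + R.T @ V_inv @ R
    rhs = Z.T @ y / sigma2 + R.T @ V_inv @ r
    return beta_ols, np.linalg.solve(lhs, rhs)

rng = np.random.default_rng(4)
n = 200
x = rng.uniform(1.0, 5.0, size=n)
Z = np.column_stack([np.ones(n), x])
y = Z @ np.array([2.0, 3.0]) + rng.standard_normal(n)

R = Z.mean(axis=0, keepdims=True)   # external information concerns the mean of y
r = np.array([11.5])                # hypothetical external estimate of that mean

for var_ext in (1e-8, 0.01, 1e8):
    beta_ols, beta_mix = mixed_estimator(Z, y, R, r, np.array([[var_ext]]))
    print(f"Var(v)={var_ext:g}: implied mean {(R @ beta_mix)[0]:.4f}, beta {beta_mix}")
```

When the external variance is close to zero the result approaches the exactly constrained estimator, and when it is very large the result approaches OLS; intermediate values weigh the two sources by their precision.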
Given the complexity and mixture of model types and functional forms in a large MSM, its parameters are usually estimated in a piecemeal way, sub-model by sub-model. This is sometimes necessary because one does not always have access to one large sample including all variables, but has to use several samples collected from different sources. Nonetheless, the piecewise approach may be inappropriate. It depends on the model structure. If the model has a hierarchical or a recursive structure, and if the stochastic structure imposes independence or lack of correlation between model blocks or sub-models, then a piecewise approach can be justified (cf. the discussion in Klevmarken, 1997).
By way of an example consider the following simple two-equation model:
This is a recursive model, and it is well-known that OLS applied to each equation separately will give consistent estimates of , and . The estimate of gives the BLUP while predictions of y2 outside the sample range are . However, this suggests the following model-wide criterion:
Minimizing this criterion with respect to and yields the OLS estimator for but the following estimator for :
In this case, both the "piecewise" OLS estimator of and the "system-wide" instrumental variable estimator (8) are consistent, but the OLS estimator does not minimize the prediction errors as defined by (7). In fact, under the additional assumption of normal errors, the estimator (8) is a maximum likelihood estimator and thus asymptotically efficient.²
If we would add the assumption that and are correlated the recursive property of the model is lost and OLS is no longer a consistent estimator of . The estimator (8) is, however, still consistent and under the assumption of normality an ML estimator. In this example we would thus prefer the "system-wide" estimator (8) whether the model is recursive or not.
In applied micro-simulation work it might not always be necessary to insist on efficient estimators. Usually, large micro-data sets are used for estimation, and then the variance of less efficient estimators might also become acceptable. In the above example we could perhaps do with the piecewise OLS estimator if the sample is large, but only if the model is recursive.
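The contrast between the two estimators can be checked by simulation. The sketch below assumes a two-equation model of the form y1 = beta1 * x + e1 and y2 = beta2 * y1 + e2, with no additional regressor in the second equation. It also assumes a model-wide criterion equal to the sum of squared errors of y1 and of y2 when y2 is predicted from x alone; minimizing such a criterion gives OLS for beta1 and the ratio sum(x * y2) / sum(x * y1) for beta2. Both the model and the criterion are assumptions chosen to match the description in the text, and the parameter values are hypothetical.

```python
import numpy as np

def estimate_recursive(x, y1, y2):
    """Piecewise OLS for both equations and the system-wide (IV) estimator of beta2."""
    beta1_ols = (x @ y1) / (x @ x)
    beta2_ols = (y1 @ y2) / (y1 @ y1)
    beta2_iv = (x @ y2) / (x @ y1)       # x used as instrument for y1
    return beta1_ols, beta2_ols, beta2_iv

def generate(n, beta1, beta2, rho, rng):
    """Draw data from y1 = beta1*x + e1, y2 = beta2*y1 + e2 with corr(e1, e2) = rho."""
    x = rng.standard_normal(n)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    e = rng.multivariate_normal(np.zeros(2), cov, size=n)
    y1 = beta1 * x + e[:, 0]
    y2 = beta2 * y1 + e[:, 1]
    return x, y1, y2

rng = np.random.default_rng(5)
for rho in (0.0, 0.6):
    x, y1, y2 = generate(20_000, beta1=1.0, beta2=0.5, rho=rho, rng=rng)
    b1, b2_ols, b2_iv = estimate_recursive(x, y1, y2)
    print(f"rho={rho}: beta1 OLS={b1:.3f}  beta2 OLS={b2_ols:.3f}  beta2 IV={b2_iv:.3f}")
```

With uncorrelated errors both estimators of beta2 are close to the true value, but once the errors are correlated only the system-wide estimator stays close, which is the pattern the text describes.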
Finally, a more general comment on the choice of estimation criterion is in order. The least-squares criteria commonly used assume that we seek parameter estimates such that the mean predictions give the smallest possible prediction errors; Eq. (7) is an example. However, in micro-simulation we are not only interested in mean predictions, but we want to simulate well the whole distribution of the target variables. This difference in focus between micro-simulation and a more conventional econometric analysis might suggest a different estimation criterion. We will return to this topic in Section 4.3.
The complexity and nonlinear character of MSMs, and the fact that they are designed to simulate, suggest that simulation-based estimation is a feasible approach to obtaining system-wide estimates. A discussion of simulation-based estimation not only leads to new estimators but also highlights the need to change the conventional estimation criteria to one that is compatible with the simulation context. Assume the following simple model:
where xt is an exogenous variable, a random variable with known p.d.f. and an unknown parameter.
We assume that does not have a closed form.
The basic idea of estimating is to obtain a distribution of simulated y-values, with properties which as closely as possible agree with those of the p.d.f. of yt. It would appear to be a natural approach to choose such that it minimizes
However, as shown in Gouriéroux and Monfort (1996) (p. 20), this "path calibrated" estimator is not necessarily consistent. To see this, consider an example that differs a little from the one used in Gouriéroux and Monfort (1996). Assume the following simple model:
We seek parameter estimates and such that the model can be simulated,
From the assumptions made and the additional assumption that (1 / n) converges to a finite limit when n tends towards infinity, it follows that
Using this criterion, we thus get an inconsistent estimate of but a consistent estimate of . Essentially, this estimator tells us to ignore the random drawings of when we simulate, i.e. only to use mean predictions. As already noted, such a procedure does not agree with the objective of micro-simulation. In this particular model, the estimate of is consistent, but if there was a functional relation between and then the slope would also become inconsistently estimated. It is perhaps possible to generalize this result and suggest that if there is any functional relation between the parameters, which determine the mean path and those determining the dispersion around this path in an MSM, then one cannot use a path-calibrated estimator.
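The inconsistency of the path-calibrated estimator is easy to reproduce numerically. The sketch below assumes a linear model y_t = beta * x_t + sigma * u_t with standard normal u_t, which is a stand-in chosen for this example and not necessarily the model used in the text. The simulated path uses independent draws in place of the unobserved errors, and the criterion is the sum of squared differences between observed and simulated values.

```python
import numpy as np

def path_calibrated(y, x, u_sim):
    """Minimize sum (y_t - beta*x_t - sigma*u_sim_t)**2 over (beta, sigma).
    The criterion is quadratic, so the minimizer is OLS of y on (x, u_sim)."""
    W = np.column_stack([x, u_sim])
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    return float(coef[0]), float(coef[1])

rng = np.random.default_rng(6)
n, beta_true, sigma_true = 50_000, 2.0, 1.5
x = rng.standard_normal(n)
y = beta_true * x + sigma_true * rng.standard_normal(n)   # observed data
u_sim = rng.standard_normal(n)                            # independent simulation draws

beta_hat, sigma_hat = path_calibrated(y, x, u_sim)
print(f"beta_hat={beta_hat:.3f} (true 2.0), sigma_hat={sigma_hat:.3f} (true 1.5)")
```

Because the simulated errors are unrelated to the observed path, the criterion is smallest when they are switched off, so the dispersion estimate is driven towards zero while the slope is still estimated well.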
An alternative approach is to use a "moment calibrated" estimator, which minimizes the distance between observed and simulated moments. This approach does not only permit calibration to moments in cross-sectional distributions, but also to transition frequencies and intertemporal correlations, which become important in dynamic models.
Let be a vector of size p and Xt a vector of size r. Furthermore, let K(yt, Xt) be a vector function of size q, and
K could, for instance, be the identity function and the square of yt. Also, define an r x q matrix Zt = Iq ⊗ xt. From the exogeneity of xt it follows that
Because there is no closed form of k(xt, ), we will define an unbiased simulator of k,
where is a vector of s independent random errors drawn from the p.d.f. of .
A simulated GMM estimator is then obtained as
where is a r x r symmetric positive semi-definite matrix. As shown in Gouriéroux and Monfort (1996), this is a consistent estimator. The covariance matrix of the estimator has two components, one which is the covariance matrix of the ordinary GMM estimator, and one, which depends on how well k is simulated. An optimal choice of depends on the unknown distribution of yt. A simulation estimator of the optimal is given in Gouriéroux and Monfort (1996) (p. 32). Two observations are in place.
The number of moment conditions (18) invoked must be no less than the number of unknown parameters, otherwise the model becomes unidentified.
The quadratic expression in (20) can be minimized using the usual gradient-based methods if first and second-order derivatives with respect to exist. If the model includes discontinuities in one would have to rely on methods not using gradients. MSM which include tax and benefit legislation typically have discontinuities in variables, which may or may not imply discontinuities with respect to behavioral parameters.
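A moment-calibrated estimator can be sketched for the same kind of toy model, y_t = beta * x_t + sigma * u_t. Several simplifications are assumptions of this example: only two moments are matched (the means of x*y and of y squared), the weighting matrix is the identity rather than an optimal choice, the simulation draws are held fixed across parameter values, and the minimization uses a successively refined grid search as one simple method that does not rely on gradients.

```python
import numpy as np

def smm_grid(y, x, U, rounds=4, points=21):
    """Moment-calibrated estimate of (beta, sigma) in y = beta*x + sigma*u.
    Matches the sample moments mean(x*y) and mean(y**2) to simulated moments
    computed from the fixed draws U (n x S), with identity weighting and a
    derivative-free grid search that is refined in each round."""
    m_obs = np.array([np.mean(x * y), np.mean(y ** 2)])

    def criterion(beta, sigma):
        y_sim = beta * x[:, None] + sigma * U            # n x S simulated values
        m_sim = np.array([np.mean(x[:, None] * y_sim), np.mean(y_sim ** 2)])
        g = m_obs - m_sim
        return float(g @ g)

    lo, hi = np.array([0.0, 0.0]), np.array([4.0, 4.0])  # initial search box
    for _ in range(rounds):
        betas = np.linspace(lo[0], hi[0], points)
        sigmas = np.linspace(lo[1], hi[1], points)
        _, beta, sigma = min((criterion(b, s), b, s) for b in betas for s in sigmas)
        step = (hi - lo) / (points - 1)
        center = np.array([beta, sigma])
        # Shrink the box to three grid steps around the best point; sigma >= 0.
        lo = np.maximum(center - 3 * step, np.array([-np.inf, 0.0]))
        hi = center + 3 * step
    return float(beta), float(sigma)

rng = np.random.default_rng(7)
n, S = 5000, 10
x = rng.standard_normal(n)
y = 2.0 * x + 1.5 * rng.standard_normal(n)   # observed data, true (beta, sigma) = (2, 1.5)
U = rng.standard_normal((n, S))              # simulation draws, reused for every parameter value

beta_smm, sigma_smm = smm_grid(y, x, U)
print(f"beta={beta_smm:.3f}, sigma={sigma_smm:.3f}")
```

In contrast to the path-calibrated criterion, the second moment carries information about the dispersion, so both parameters are recovered. With two moments and two parameters the model is just identified, which is the minimum required by the first observation above.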
It should be possible to include constraints of the kind discussed in a previous section in the simulation based approach. Suppose K is the identity function in yt, so the moment condition becomes
The empirical correspondence to the expression to the left of the equality sign is
Suppose now that we know the finite sample mean . How could we use this information? If we also knew the Xt values for all individuals in the finite sample, we could substitute in (22) for and extend the summation in the second term of (22) to N, and thus get an empirical correspondence to (21) for the whole finite population. In practice, this is of course not possible. One only knows the x-observations of the sample, but with known selection probabilities pt they can be used to compute the following estimate:
The covariance matrix of the resulting estimate should now have a third component, which reflects the sampling from the finite population. (A simple numerical example is provided in Klevmarken (1998).)
Lacking measures of the quality of an MSM most practitioners validate their models by analyzing how closely they track observed data and known benchmarks.
If the tax and benefit legislation has been translated into computer code with sufficient detail and care and the data are detailed and accurate enough, there is no need to validate a conventional static tax-benefit model without behavioral adjustments, because there is nothing to validate. However, if the simulation model includes behavioral adjustments, there is a validation problem. But how would one go about validating a static model? Is it at all possible? The problem with the comparative statics of a static MSM is that it does not give predictions for any specific point in time or time interval, and thus, it is hard to know to what the simulations should be compared. Suppose for instance, that a labor force participation equation is estimated from a cross-section at the end of a long period of unchanged tax and benefit systems and a stable labor market. Then a major tax reform takes place. Is it a good idea to validate the predictions from this model by comparing with observed participation rates from the first, second or third year after the reform?
Validation of a dynamic and dated model does not suffer from the same problem. In this case simulated values have a correspondence in the real world. Validation involves two major issues: first, the choice of criterion and validation measure, and second, the derivation of the stochastic properties of this measure, taking all sources of uncertainty into account. The choice of criterion for validation is of course closely related to that for estimation. As already mentioned, in dynamic micro-simulation we are interested not only in good mean predictions, but also in good representations of cross-sectional distributions and of transitions between states. The timing of events becomes important. An MSM is likely to rest on a number of simplifying assumptions about lack of correlation and independence, both between individuals and over time. For this reason, one might expect too much random noise in the simulations and correlations that decay too quickly compared to real data. In addition to model-wide criteria, one might thus be interested in criteria that focus on these particular properties. Work is needed to develop such measures with known properties.
For a model not too big and complex in structure it might be feasible to derive an analytic expression for the variance-covariance matrix of the simulations, which takes all sources of uncertainty into account: random sampling, estimation and simulation errors (for an example, see Pudney and Sutherland (1996)). In general, MSMs are so complex that analytical solutions are unlikely. Given the parameter estimates, the simulation uncertainty can be evaluated if simulations are replicated with new random number generator seeds for each replication. There is a trade-off between the number of replications needed and the sample size: the bigger the sample, the fewer replications are needed.
To evaluate the uncertainty which arises through the parameter estimates one approach is to approximate the distribution of the estimates with a multivariate normal distribution with mean vector and covariance matrix equal to that of the estimated parameters. By repeated draws from this normal distribution and new model simulations for each draw of parameter values an estimate of the variability in the simulation due to uncertainty about the true parameter values can be obtained.
To avoid the normal approximation, one might use sample re-use methods. For instance, by bootstrapping one can obtain a set of replicated estimates of the model parameters. Each replication can be used in one or more simulation runs, and the variance of these simulations will capture both the variability in parameter estimates and the variability due to simulation (model) errors. If the bootstrap samples are used not only to estimate the parameters but also as replicated bases (initial conditions) for the simulations, then one would also be able to capture the random sampling errors.
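The bootstrap scheme can be sketched for a small static example. Each bootstrap sample is used both to re-estimate the parameters and as the base for the simulation. The linear income model, the poverty line and the use of resampled residuals as simulation errors are assumptions made for this illustration.

```python
import numpy as np

def fit_ols(Z, y):
    """Least-squares coefficients of y on Z."""
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta

def bootstrap_simulation(Z, y, poverty_line, B, rng):
    """For each of B bootstrap samples: re-estimate the model, simulate incomes on
    that sample with errors drawn from its own residuals, and record the simulated
    share below the poverty line."""
    n = len(y)
    stats = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, size=n)       # resample individuals with replacement
        Zb, yb = Z[idx], y[idx]
        beta_b = fit_ols(Zb, yb)               # replicated parameter estimate
        resid_b = yb - Zb @ beta_b
        eps = rng.choice(resid_b, size=n, replace=True)   # empirical error distribution
        y_sim = Zb @ beta_b + eps              # simulation on the bootstrap base
        stats[b] = np.mean(y_sim < poverty_line)
    return stats

rng = np.random.default_rng(8)
n = 1000
x = rng.uniform(0.0, 10.0, size=n)
Z = np.column_stack([np.ones(n), x])
y = 5.0 + 1.2 * x + 3.0 * rng.standard_normal(n)   # hypothetical income data

shares = bootstrap_simulation(Z, y, poverty_line=8.0, B=200, rng=rng)
print(f"simulated poverty share: {shares.mean():.3f}, bootstrap s.e.: {shares.std(ddof=1):.3f}")
```

The spread of the replicated shares reflects sampling, estimation and simulation errors jointly. It does not separate them, and it says nothing about errors due to the choice of model structure.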
Much of the total error in simulated values will come from the choice of a particular model structure. Sensitivity analysis is an approach to assess the importance of this source of error. As pointed out in Citro and Hanushek (1997) (p. 155) "sensitivity analysis is a diagnostic tool for ascertaining which parts of an overall model could have the largest impact on results and, therefore, are the most important to scrutinize for potential errors that could be reduced or eliminated". If simple measures of the impact on key variables from marginal changes in parameters and exogenous entities could be computed they would potentially become very useful.
The credibility of MSMs with the research community as well as with users will in the long run depend on the application of sound principles of inference in the estimation, testing and validation of these models. This paper has reviewed a few issues of inference in static and dynamic MSMs.
The application of a model-wide estimation criterion will, in general, suggest an estimator that does not permit piecewise estimation, sub-model by sub-model. Only if the model has a hierarchical or recursive structure is it possible to use a piecewise approach.
It was also suggested that the alignment procedures now used in practical work could be seen as part of the estimation procedure. It followed that one should, in general, not use simple proportional alignment. It is important to note that the constraints imposed by alignment must be tested to see whether they are accepted by the data. If they are not, that is a clear indication that something is wrong with the model, and it should be reformulated rather than forced "on track" by alignment.
It was also suggested that the simulation approach to estimation could be useful in micro-simulation work. These models are designed to simulate, and they also frequently include nonlinear and complex relations, which suggests that simulation-based estimation has a relative advantage. However, path-calibrated estimates are, in general, inconsistent and should be avoided, in particular in a micro-simulation context which does not only focus on mean relations. A better alternative is moment-calibrated estimates.
The maximum likelihood approach was never mentioned in the discussion of estimation issues above. Usually, one has very little reason to choose a particular family of distributions a priori, and an unrealistic but convenient choice might seriously distort the simulated distributions. Recall that this is an important issue in micro-simulation. To preserve the distributional properties of data when simulating, one sometimes chooses not to draw random errors from a specified distribution like the normal, but from empirical distributions of residuals. The maximum likelihood approach then becomes impossible.
Finally, there is the issue of whether it is practically feasible to use the suggestions given above. The programs of some of the MSMs now used in practice need so much computer time for one single simulation that any method requiring repeated simulations would seem impractical. However, code, programming skills and computers keep improving, and more recent models tend to run much faster than old models. Depending on model structure, it might also become possible to apply the approaches suggested to smaller sub-models.
¹ MSMs often include discontinuities, which could imply that these regularity conditions do not hold, but models are more likely to be continuous in the behavioral parameters than in variables whose values are determined by legislation and government rules.
² The estimator (8) is an ML estimator because there is no additional x-regressor in the second relation. The reduced form becomes a SURE system with the same explanatory variable in both equations. In general, the ML estimator will depend on the structure of the covariance matrix of the errors.
This section relies to a large extent on the book by Gouriéroux and Monfort (1996).
Micro/macro-simulation of socioeconomic population processes. IBM Computing Conference.
Content, validation and uses of CORSIM 2.0, a dynamic microanalytic model of the United States. Paper presented at the IARIW Conference on Micro-simulation and Public Policy.
Microsimulation and Public Policy. Amsterdam: North-Holland.
Assessing Policies for Retirement Income. Needs for Data, Research and Models. National Research Council. Washington, DC: National Academy Press.
Simulation-Based Econometric Methods. Oxford: Oxford University Press.
Behavioral Modeling in Micro-Simulation Models. A Survey. Working Paper No. 31. Department of Economics, Uppsala University, Uppsala, Sweden.
Statistical Inference in Micro-Simulation Models: Incorporating External Information. Working Paper No. 20. Department of Economics, Uppsala University, Uppsala, Sweden.
Utvärdering Av Framskrivningstekniken i FASIT-Modellen (Evaluation of the Calibration Technique Used in the FASIT Model). Memo, Statistics Sweden.
Calibration as a Standard Method for Treatment of Nonresponse. Department of Statistics (diss.), Stockholm University.
ADJUST: A Program Package for Adjustment of Micro Data by Minimum Information Loss Principle. Program Manual, FFB-Documentation No. 1. University of Lüneburg.
Microdata Adjustment Using the Minimum Information Loss Principle. Discussion Paper No. 10. Forschungsinstitut Freie Berufe, Universität Lüneburg.
Statistical reliability in microsimulation models with econometrically estimated behavioral responses. In: Microsimulation and Public Policy. Amsterdam: Elsevier.
No specific funding for this article is reported.
This article has been previously published in Mathematics and Computers in Simulation 59 (2002) 255-265.
- Version of Record published: April 30, 2022 (version 1)
© 2022, Anders Klevmarken
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.