# Dynamic Microsimulation for Policy Analysis. Problems and Solutions

1. Department of Economics, Sweden
Research article
Cite this article as: A. Klevmarken; 2022; Dynamic Microsimulation for Policy Analysis. Problems and Solutions; International Journal of Microsimulation; 15(1); 121-134. doi: 10.34196/ijm.00256

## 1. What is micro simulation?

Micro simulation is a technique that uses the capacity of modern computers to make micro units act and interact in such a way that it is possible to aggregate to the level of interest. A micro simulation model can be seen as a set of rules that operates on a sample of micro units such as individuals, households and firms. Each micro unit is defined and characterized by a set of properties (variables), and as the model is simulated these properties are updated for each and every micro unit. The model might simply be a set of deterministic rules, such as the income tax rules of a country operating on a sample of taxpayers, used to compute the distribution of after-tax income, the aggregate income tax revenue or other fiscal entities of interest. But the model could also include behavioral assumptions, usually formulated as stochastic models. Examples are fertility models, models for household formation and dissolution, labor supply and mobility.

In micro simulation modeling there is no need to make assumptions about the average economic man. Although it is impractical, we can in principle model every individual in the population. It is no simple task to model the behavior of single consumers and firms, but it is an advantage to model the decisions of those who actually make them and not the make-believe decisions of some aggregate. It stimulates the researcher to pay attention to the institutional circumstances that constrain the behavior of consumers and firms. It also suggests in a straightforward way what data should be collected and from whom. Similarly, in a micro simulation model it is possible to include the true policy parameters and the rules which govern their use, such as tax rates, eligibility rules, tax thresholds, etc. One is not confined to using average tax rates applied to everyone. This makes micro simulation especially useful for policy analysis.

The development of micro simulation can be traced to two different sources. One is Guy Orcutt’s idea about mimicking natural experiments also in economics and his development of the behavioral dynamic micro simulation model DYNASIM (Orcutt, 1957; Orcutt et al., 1961; 1976) that later was further developed by Steven Caldwell into the CORSIM model (Caldwell, 1993; 1996). Another source is the increased interest among policy makers in distributional studies. Changes in the tax and benefit systems of many Western economies have created a need for a tool that analyzes who will win and who will lose from such changes. As a result many governments now have so-called tax-benefit models. Examples are the Danish LAW model, the UK model POLIMOD (Redmond et al., 1996), STINMOD in Australia (Lambert et al., 1994; Schofield and Polette, 1996), SWITCH in Ireland (Callan et al., 1996), and FASIT1 in Sweden. At the European level EUROMOD is an ambitious attempt to build a tax-benefit model for the whole EU (Sutherland, 1996; Sutherland, 2001). These models usually do not include behavioral relations but only most details of the tax and benefit rules. These rules are then applied to a sample of individuals for which one knows all gross incomes and everything else needed to compute taxes and benefits. For every individual in the sample one is thus able to compute the sum of all (income) taxes due and the disposable income for each household. The output becomes, for instance, the distribution of disposable income. It is then possible to change the tax rates or anything else in the tax code, run the model once again and compare with the previous result. In this way one can analyze who will gain and who will lose from a tax change, and estimate the aggregate budget effects of tax and benefit changes. 
The simulation model will, however, only give the first-order effect of a tax change, because household composition, work hours and incomes are assumed unchanged and not influenced by the taxes. This is both a strength and a weakness of the tax-benefit models. It is a strength because it is easy to understand what the model does and no controversial assumptions are needed. There is also no difficult inference problem. All the analyst needs to do is to draw an inference from the random sample of taxpayers to the population of taxpayers, which is something we know from sampling theory.2 The weakness is of course that we do not know the relative size of any adjustments of behavior to the tax and benefit changes. Many tax reforms aim at changing the behavior of taxpayers. The first-order effect might then become a bad approximation. As a result attempts have been made to enlarge the tax-benefit models with behavioral models to capture these adjustments. Duncan and Weeks (1997) give an example of a tax-benefit model amended with a labor supply model. Additional examples can be found in Table 3 in Klevmarken (1997). In this way the tax-benefit models approach Orcutt’s DYNASIM and its successors.
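The static calculation performed by a tax-benefit model can be illustrated with a minimal Python sketch. The one-threshold tax schedule, the rates, and the three-household weighted sample are all hypothetical and chosen only for illustration; none of the models named above is reproduced here.

```python
from dataclasses import dataclass


@dataclass
class TaxRules:
    threshold: float  # income below this is untaxed (hypothetical value)
    rate: float       # marginal rate above the threshold (hypothetical value)


def income_tax(gross: float, rules: TaxRules) -> float:
    """Tax due under a simple one-threshold schedule."""
    return max(0.0, gross - rules.threshold) * rules.rate


def first_order_effect(sample, baseline, reform):
    """Compare two rule sets on the same sample, holding gross income fixed.

    sample: list of (gross_income, sampling_weight) pairs.
    Returns (weighted revenue change, weighted winners, weighted losers).
    """
    revenue_change = 0.0
    winners = losers = 0.0
    for gross, weight in sample:
        diff = income_tax(gross, reform) - income_tax(gross, baseline)
        revenue_change += weight * diff
        if diff < 0:
            winners += weight   # pays less tax after the reform
        elif diff > 0:
            losers += weight    # pays more tax after the reform
    return revenue_change, winners, losers


# Hypothetical sample: (gross income, number of taxpayers represented).
sample = [(10_000.0, 100.0), (30_000.0, 200.0), (80_000.0, 50.0)]
baseline = TaxRules(threshold=20_000.0, rate=0.30)
reform = TaxRules(threshold=25_000.0, rate=0.35)
delta_revenue, winners, losers = first_order_effect(sample, baseline, reform)
```

Because gross incomes are held fixed, the result is exactly the first-order effect discussed above; any behavioral response would have to come from an additional model.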

Large scale dynamic microsimulation models with behavioral adjustments typically include demographic models that move the population forward, models that simulate earnings and labor supply and sometimes also models for geographical mobility, demand for housing etc. Most models of this kind have been developed in academic environments and have rarely been used to advise governments on policy issues. Examples are the Swedish MICROHUS (Klevmarken and Olovsson, 1996; Klevmarken, 2001, Appendix), the Dutch NEDYMAS (Nelissen, 1994) and the German Sfb3-MSM (Helberger, 1982; Hain and Helberger, 1986; Galler and Wagner, 1986 and Galler, 1989, 1994). Among the few models of this kind that have been used for policy purposes are CORSIM and its Canadian sister model DYNACAN (Morrison, 1997), the Swedish Ministry of Finance model SESIM (Eriksson and Hussénius, 1999; Flood et al., 2005; and www.sesim.org) and the microsimulation model used by the Swedish National Insurance Board (RFV) to simulate the future of the Swedish public pension system. The latter model is probably one of the oldest policy driven microsimulation models that are still in operation. It was developed in the beginning of the 1970s (Eriksen, 1973; Klevmarken, 1973).

Most of the models mentioned above are not very explicit and detailed about the path economic subjects follow to reach a decision. These models usually take the form of conditional distributions or transition matrices that only describe the outcome of the decisions taken. There is a class of microsimulation models, sometimes called “agent-based models” that, for instance, model the search behavior of agents in the market and when a transaction takes place, rather than just the distribution of transactions. The data requirements for these models are even more demanding than those of more conventional microsimulation models and as a result there are rather few models of this kind and they are experimental in character, see for instance, Eliasson (1996).

A number of conference volumes give good surveys of the microsimulation territory, for instance Orcutt et al. (1986), Harding (1996), Gupta and Kapur (2000) and Mitton et al. (2000).

## 2. Modelling for micro simulation

Most micro simulation models are built for policy analysis. One can see a micro simulation model as a laboratory in which it is possible to evaluate alternative policies in a constant environment, see for instance Rake (2000) who used a common micro simulation model to compare the properties of the pension systems of France, Germany and the UK.3 This particular focus of micro simulation raises a number of issues related to model building.

First, as already mentioned micro simulation models can relatively easily accommodate real life policy parameters when for instance, tax, eligibility and benefit rules are programmed into a model. There is no need to apply average tax rates to aggregates of individuals.

Tax rules and rules that determine who is eligible for various benefits are usually highly nonlinear and sometimes have discontinuous jumps. Micro simulation models have the advantage of relatively easily accommodating such functional forms. One is thus not confined to functions with smooth properties. Sometimes we have policy problems which require that these discontinuities are modeled at an individual level where an aggregate approach is impossible. For instance, in the Swedish ATP pension system pensions were based on the 15 best years of earnings and studies of the properties of this pension system required simulations of individual earnings profiles to determine the 15 best years. Another example is a study of the cost for old age care in the United Kingdom (see Hancock (2000)). In this case the liability for charges was related to income and wealth in a complicated non-linear way.
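A small Python sketch shows why a rule such as "the 15 best years" cannot be evaluated on aggregates. The function name, the treatment of short careers, and the two earnings profiles are assumptions made for illustration; the sketch does not reproduce the actual ATP rules.

```python
import heapq


def best_years_average(earnings, n_best=15):
    """Average of the n_best highest annual earnings in a career.

    If the career is shorter than n_best years, the missing years count as
    zero (an assumption of this sketch, not a statement about ATP rules).
    """
    top = heapq.nlargest(n_best, earnings)
    return sum(top) / n_best


# Two hypothetical 30-year careers with the same lifetime earnings.
flat = [100.0] * 30
peaked = [50.0] * 15 + [150.0] * 15

flat_base = best_years_average(flat)
peaked_base = best_years_average(peaked)
```

The two careers have identical lifetime totals, yet the pension base differs, so only a simulation of individual earnings profiles can evaluate the rule.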

Some policy issues are related to the behavior of relatively small and sometimes extreme groups of the population. This is for instance the case when we analyze poverty or study how a new tax will work on various subgroups of the population. The models needed in these situations are models that simulate the whole distribution of outcomes well. It is not sufficient to simulate means or conditional means; we also have to replicate the tails of the distributions. This focus has obvious implications for model building: heterogeneity in behavior must be reflected in the model structure and the properties of residual variation must be carefully considered. But it also has implications for estimation and validation. Criteria for estimation and validation should agree with the general purpose of micro simulation and not just aim at estimating conditional means (see below).

In building micro simulation models we are thus primarily interested in models that simulate well, while the interpretation of model parameters is of secondary interest. Good predictors or simulators use all available information at the time the simulation is done. Suppose, for instance, that we are interested in simulating a variable $y_{ti}$, $t=1, 2, 3, \ldots$, for every individual $i=1, \ldots, n$. Assume also that we know $y_{0i}$ (and possibly also a longer history) for every individual. We will then need a model structure that uses the information $y_{0i}$ to simulate the future of each individual. Economic models are not always derived in this form. It is often the case that these models have been adapted to cross-sectional applications and implicitly assume that a new decision is taken at every new time point without being influenced by past decisions. Such models are not very useful in dynamic micro simulation. They tend to introduce too much mobility and too quickly decaying autocorrelations. Consider for instance a labor supply model. In a dynamic micro simulation context labor supply should not only be a function of the wage rate, non-labor income and the income tax system, but also of the current and possibly past labor supply, episodes of non-work etc. Most people who have a job do not change their hours very much from one year to another.

Similarly, simulated tenure choice should depend on current tenure choice and possibly also on when the family moved into their current house or apartment. A family which has recently moved into a new house has a low probability of moving again. From a theoretical point of view it would seem quite natural that decisions depend on the current situation of the decision maker. One could for instance think of habits, costs associated with a change or a move, and decisions about durables covering more than one period.

Dynamic micro simulation models thus often take the form of conditional distributions, distributions that are conditioned on past history and recent changes in key variables that influence a new decision.4
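The difference between a model that conditions on the past and one that redraws every period can be sketched as follows. The first-order autoregressive form, the parameter values and the standardized start values are hypothetical; they only illustrate the point about mobility and autocorrelation and are not taken from any model discussed here.

```python
import numpy as np


def simulate_paths(y0, rho, sigma, periods, rng):
    """Simulate y_t = rho * y_{t-1} + e_t forward from known start values y0."""
    paths = np.empty((periods + 1, y0.size))
    paths[0] = y0
    for t in range(1, periods + 1):
        paths[t] = rho * paths[t - 1] + rng.normal(0.0, sigma, size=y0.size)
    return paths


def lag_correlation(paths, lag):
    """Cross-individual correlation between period 0 and period `lag`."""
    return float(np.corrcoef(paths[0], paths[lag])[0, 1])


rng = np.random.default_rng(0)
# Hypothetical start values, e.g. standardized deviations of hours from the mean.
y0 = rng.normal(0.0, 1.0, size=5000)

# Model conditioned on last period's value (rho = 0.9, unit stationary variance).
persistent = simulate_paths(y0, rho=0.9, sigma=np.sqrt(1 - 0.9**2), periods=5, rng=rng)
# "Cross-sectional" model: a fresh draw each period that ignores the past.
memoryless = simulate_paths(y0, rho=0.0, sigma=1.0, periods=5, rng=rng)

corr_persistent = lag_correlation(persistent, 1)
corr_memoryless = lag_correlation(memoryless, 1)
```

Both simulations reproduce the same cross-sectional distribution, but only the first keeps individuals close to where they started, which is the property the text asks of a dynamic model.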

Micro simulation models have sometimes been criticized because they do not have the character of structural relations but rather that of reduced forms. As a consequence one has questioned the autonomy and stability of the model structure under policy changes. This is a discussion that goes back at least to Haavelmo’s famous supplement to Econometrica (Haavelmo, 1944) in which he discussed the concepts of an autonomous model and an empirically stable model structure, a discussion that became revitalized through the Lucas critique. With the micro simulation focus on policy evaluation this is an important issue. If the model structure is not autonomous to policy changes (within ranges of interest) it is not possible to use the model for this purpose. It is, however, not easy to know if a model is sufficiently stable to permit the analysis of a certain change in policy.

Consider the following example. Frequently we need to model the presence of a property or an event and if the individual has the property or experiences this event we have to simulate an intensity or an amount. For instance, we might like to simulate if an individual has a job or not and if he/she is working we also need the number of work hours. A common model is a two-equation selection model with a probit equation for the participation decision and a regression model capturing the number of hours. If there is correlation between the two equations, they are usually estimated jointly or by the Heckit approach to capture the endogenous selection into the group of working individuals. The estimated hours equation then applies unconditionally to the entire population. In a simulation context the two equations have to be simulated jointly. An alternative approach is to estimate the distribution of hours worked conditional on having a job. The model for the probability to get a job and the hours model can then be simulated sequentially. This second approach can be criticized because the properties of the conditional hours equation might depend on who is selected into having a job. Any policy change that influences employment may then also change the hours equation. On the other hand, only those who get or have a job have a choice of hours, while those who do not have a job might not even think about how many hours they would work had they had a job. An equally plausible model is thus to assume that it is the fact that an individual has a job that determines the decision about hours. This does not exclude that individuals choose different hours and that the composition of the group of employed will determine the total number of hours and even the dependence of hours on wage rates and nonlabor incomes. 
To get a model that is robust to policy changes the factors that determine heterogeneity in the responses to wage rates and income changes would then have to be included in the hours equation.
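The sequential (second) approach described above can be sketched in Python: first draw whether the individual has a job, then draw hours only for those who do. The probit index, the hours equation and every coefficient are hypothetical placeholders, not estimates from any of the models discussed.

```python
import math
import random


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def simulate_person(wage, worked_last_year, rng):
    """Draw employment first, then hours conditional on having a job."""
    # Step 1: probit for having a job (all coefficients are hypothetical).
    index = -0.5 + 0.02 * wage + 1.2 * worked_last_year
    employed = rng.random() < normal_cdf(index)
    if not employed:
        return False, 0.0
    # Step 2: hours equation applied only to the employed (hypothetical).
    hours = 1500.0 + 5.0 * wage + rng.gauss(0.0, 200.0)
    return True, max(0.0, hours)


rng = random.Random(0)
employed, hours = simulate_person(wage=20.0, worked_last_year=1, rng=rng)
```

In the joint selection model, by contrast, the two error terms would be drawn from a bivariate distribution with nonzero correlation, so the two steps could not be simulated independently in this way.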

It is certainly not sufficient for autonomy to have a model that is derived through some kind of optimization such as utility maximization and includes the “deep” parameters of a utility function. Although many economists think of preferences as something stable, there is not much empirical evidence to support this notion. In particular there are good reasons to think that preferences change as people age. There are very few if any economic fixed points and a humble attitude towards the autonomy of economic models seems justified. In the end it is an empirical issue to find out if a model structure is stable and not influenced by policy changes.

## 3. Data problems

Good data are needed for three different purposes: data that give individual start values of the simulations, data for estimation and data for validation. In dynamic micro simulation one would ideally have a longitudinal data set that can serve all three purposes. This is rarely the case. In practice the data set that gives the start values is usually a cross-sectional sample or at best a short panel that can also be used to estimate some of the relations needed in the micro simulation model. For this project we have been able to use the rich longitudinal register data of the LINDA data base from Statistics Sweden. Start data were taken from the 1999 wave, but amended with information from other sources.5 Quite frequently behavioural relations have to be estimated using other data sources than the start data set because the variables needed are not included (with the accompanying problems of differences in the definitions of units of analysis and the variables being used). The set of variables included in the data set that gives start values thus sets limits as to which variables one can use to explain behaviour. Not only the target variables of the micro simulation analysis need be simulated but also all explanatory variables and they also have to be assigned start values. To some extent this problem can be circumvented. An example from our work on the simulation model SESIM can explain this. LINDA does not include any health data, but health status is an important variable both in its own right and to explain various processes related to aging such as retirement and the demand for health care and social care. There was thus a need to introduce a model that imputed health status for the base year and then simulated any changes in health status as people aged. 
This model was estimated from an external data source, recognizing that the variables driving the health status model, such as age, gender, marital status, schooling, area of residence and whether born in Sweden, had to be included in the data set of start values.

Extending a micro simulation model with new sub models to simulate both target variables and non-target explanatory variables obviously introduces a lot of noise into the simulated distributions. There is a trade-off between theory driven modelling and the desire to avoid simulation errors introduced when data or theory are weak. Suppose theory suggests that the distribution of a variable y depends on another variable x, while it is difficult to formulate and estimate a good model for x because there is no good theory to explain x or because we do not have access to all the data needed to estimate such a model. Depending on the context it might then be better to simulate y unconditionally of x, or use a proxy for x if possible, than to use a poor model to simulate x. In evaluating these alternatives one again has to consider the possibility that a change in policy might change the unconditional distribution of y while the conditional distribution might be more stable (cf. above).

Most micro simulation models are fixed period models, i.e. the time period between events is fixed to, for instance, a year. The possibility of multiple changes in individual status in this time span is usually not accounted for. Dynamic models condition on the status attained in the previous period (and possibly also in earlier periods). To estimate such models one needs annual panel data. In building the SESIM model we encountered a problem in estimating the health status model mentioned above. The data we could use had a longer time span than a year. In this particular case we had to use data with a lag of eight years. Assuming that the true model has a lag of one year and that the parameters were known, it would be possible to simulate health status paths for eight years and compare with observed data. It is thus possible to estimate such a model from data with a lag of more than one year using simulation-based estimation; see chapter 4 in Klevmarken and Lindgren (2008) and Eklöf (2004) for a discussion.
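The idea of recovering a one-year model from eight-year data can be sketched with a deliberately simple example. The two-state (healthy, ill) Markov chain, the parameter grid and all numbers are hypothetical, and the eight-year transition probabilities are computed exactly by a matrix power rather than by Monte Carlo simulation, so this is a deterministic analogue of simulation-based estimation and not the procedure used for SESIM.

```python
import numpy as np


def annual_matrix(p_fall_ill, p_recover):
    """Two-state (healthy, ill) one-year transition matrix."""
    return np.array([[1.0 - p_fall_ill, p_fall_ill],
                     [p_recover, 1.0 - p_recover]])


def fit_annual_from_lagged(observed, lag, grid):
    """Grid search for annual probabilities whose lag-year matrix matches `observed`."""
    best, best_loss = None, np.inf
    for p in grid:
        for q in grid:
            implied = np.linalg.matrix_power(annual_matrix(p, q), lag)
            loss = float(np.sum((implied - observed) ** 2))
            if loss < best_loss:
                best, best_loss = (float(p), float(q)), loss
    return best


# Hypothetical "true" annual model; its 8th power stands in for the
# transition frequencies one would observe in data with an eight-year lag.
true_annual = annual_matrix(0.05, 0.20)
observed_8yr = np.linalg.matrix_power(true_annual, 8)

grid = np.linspace(0.01, 0.50, 50)  # candidate annual probabilities
estimate = fit_annual_from_lagged(observed_8yr, lag=8, grid=grid)
```

With real data the observed matrix would be subject to sampling error and the model would have covariates, so the implied eight-year outcomes would typically be obtained by simulating paths and the distance minimized with a numerical optimizer.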

Exploring micro data usually reveals observations that deviate from the behaviour suggested by theory or common sense. There are almost always outliers and sometimes there are groups of outliers that are so numerous that it is hard to ignore them when modelling. For instance, to simulate the tax base for the tax on income from capital we needed a function that simulated interest paid on mortgages and loans. One might think that nobody would pay interest unless one has a loan and that it would be natural to model interest paid not in kronor but as a rate on the sum of all mortgages and loans. In this way we could also avoid the problem of indexing an amount by the CPI or another index. However, it turned out that some of these rates came out with missing values and some rates were unrealistically high. Our annual data included mortgages and loans at the end of a year and the sum of all interest paid during a year. Closer inspection showed that 25% of the sample households had paid interest in a year without having any loan or mortgage at the end of the year and 10% had paid interest without having any loans or mortgages either in the beginning of the year or at the end. The probability of misreported data is very low, because they were register data originating from banks, insurance companies and brokers and not self-reported data. A reasonable explanation is that some people take up a loan and repay it in the same calendar year and thus they had no loan in the beginning nor at the end of the year, but they had to pay interest. The unrealistically high estimated rates had a similar explanation. People had taken a loan or increased their mortgage in a year but repaid most of it before the end of the year. In these cases the liability both in the beginning of the year and at the end was small relative to the interest paid. 
Our solution in this particular case was to stick to the idea of an interest rate function but censor the data before the model was estimated and only use observations with liabilities of at least 1000 SEK (111 euro) at the end of the year. The convenience of this solution has the price that our simulations will underestimate the interest paid a little because households with no liabilities will in the simulations never pay interest, but the amounts paid by these households are after all rather small.
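The censoring rule can be written down in a few lines of Python. The 1000 SEK cut-off is taken from the text; the function name and the four example households are hypothetical.

```python
MIN_LIABILITY_SEK = 1000.0  # cut-off stated in the text


def implied_rates(households, min_liability=MIN_LIABILITY_SEK):
    """Return interest/liability ratios for households passing the cut-off.

    households: iterable of (interest_paid, end_of_year_liability) in SEK.
    """
    rates = []
    for interest, liability in households:
        if liability >= min_liability:
            rates.append(interest / liability)
    return rates


# Hypothetical register records: (interest paid in year, liability at year end).
households = [
    (5_000.0, 100_000.0),  # ordinary loan
    (2_000.0, 0.0),        # loan taken and repaid within the year: rate undefined
    (3_000.0, 500.0),      # mostly repaid before year end: implied rate of 600%
    (400.0, 10_000.0),     # ordinary loan
]
rates = implied_rates(households)
```

Only the first and last households enter the estimation sample; the second would produce a missing value and the third an unrealistically high rate, which are the two problems described above.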

## 4. Macro shocks, feedback and markets in micro simulation models

Most individual and household micro simulation models simulate the behaviour of consumers and households but have no supply side and no market clearance mechanisms. Macro entities such as the CPI, interest rates, the increase in the average wage rate, and the unemployment rate are usually fed into the micro simulation model exogenously. This is a simple solution in already rather complex models, but there are at least two problems: First, the macro assumptions must be made internally consistent and consistent with the path of the household sector simulated by the model. Second, it is usually very difficult to have an opinion about the values of these macro indicators in the future. Most analysts thus assume constant rates possibly after a few years of variation close to the base year. This convention has the disadvantage of not exposing the micro simulation model to normal business cycle variations. A more realistic approach, which still makes the macro indicators exogenous to the micro simulation model, is to generate swings in the macro indicators mimicking business cycles by, for instance, a VAR model.
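A minimal sketch of how exogenous macro indicators with cyclical swings could be generated by a first-order VAR is given below. The two variables, the coefficient matrix and the shock structure are hypothetical; in practice they would be estimated from historical macro series.

```python
import numpy as np


def simulate_var1(mean, coef, chol, periods, rng):
    """Simulate z_t - mean = coef @ (z_{t-1} - mean) + chol @ e_t, starting at the mean."""
    k = mean.size
    path = np.empty((periods, k))
    dev = np.zeros(k)  # deviation from the long-run mean
    for t in range(periods):
        dev = coef @ dev + chol @ rng.standard_normal(k)
        path[t] = mean + dev
    return path


# Hypothetical two-variable system: inflation and real wage growth, in percent.
mean = np.array([2.0, 2.0])
coef = np.array([[0.6, 0.1],
                 [0.0, 0.5]])
chol = np.array([[0.5, 0.0],   # lower-triangular factor of the shock covariance
                 [0.1, 0.4]])

path = simulate_var1(mean, coef, chol, periods=50, rng=np.random.default_rng(1))
```

The eigenvalues of `coef` lie inside the unit circle, so the simulated series fluctuate around their long-run means instead of drifting; the resulting path can then be fed into the micro simulation model in place of constant rates.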

More ambitious attempts to merge a micro simulation model with a general equilibrium model can be found in the literature, but merging two such complex models into one must still be seen as experimental (one example is Cameron and Ezzeddin, 2000). One can, however, think of situations in which a market feedback is of key importance for the micro simulation analysis. Returning to our experiences of the model SESIM, when we introduced geographical mobility into the model, we also had to consider its potential impact on house prices. Housing wealth is a major share of the wealth of Swedish households and the market values of houses and apartments show considerable variation across the country. While there is not much of a trend in the real house price index for the whole country, only cyclical swings, the prices in the major metropolitan areas have increased dramatically in the last decades. If the model simulated continued migration to these areas, house prices would most certainly increase even more and at the same time shift wealth from households living in urban areas to households in the major cities, but probably also cool off the migration process unless the building of new homes in these areas could accommodate the new migration. In this case it would thus be in line with one purpose of SESIM (to simulate the future distribution of wealth for the elderly) to include the market, at least in the form of a price mechanism.6 Until this is accomplished, we have to project house prices forward exogenously.

There are thus a few macro indicators fed exogenously into SESIM. In our base scenario we use observed values for the period 1999-2005 and assumed values thereafter. In this case we have not introduced any assumptions about business cycles but built up our macro series around a steady increase in real wages of 2% and a general increase in prices also of 2% (for additional details see chapters 3 and 12 in Klevmarken and Lindgren, 2008). In this scenario we will thus be able to study the steady state properties of the model when it is unaffected by business cycles.

In its adjustment to a steady state a dynamic microsimulation model can sometimes initially show a rather dramatic change in key variables. This can be a result of changes in exogenous macro variables. For instance, prices on the financial markets might show a high volatility and the few initially observed values might deviate much from the assumed long-term values. It can also be a result of the dynamic model structure, in particular if it is such that the simulated path is sensitive to the start values. Another possible explanation is lack of coherence between the data used to estimate the model and the start data set. It is thus not always obvious how one should look upon the first few years of simulated values: do they give a realistic picture of the economy, or is the adjustment phase primarily the result of data problems?

## 5. Estimation

If the relations of a micro simulation model are estimated using large micro data sets efficiency is less of a concern than consistency. But many economic variables have non-standard distributions. They are often heavily skewed and have a high kurtosis. Sometimes a variable transformation is helpful, but it is not always easy to find one. Data might also be censored from below or from above. Variables capturing assets and other wealth items sometimes show extreme outliers. OLS-based estimation is not such a good idea in these situations because the extreme tails will tend to dominate the estimates to the extent that the model might not simulate well even in the centre of the distribution. In our work with SESIM we have sometimes preferred to work with robust estimation methods (robust regression) that weight down the extreme tails and thus limit their influence. As a consequence, residuals to the estimated relation must be simulated from the empirical distribution of the residuals from the robust regression. We have chosen to estimate the percentiles of this distribution and interpolate linearly between them. In the last percentile one might have to deviate from a linear interpolation between the 99th percentile and the maximum value in order to avoid excessively many very large draws.
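Drawing residuals from estimated percentiles with linear interpolation amounts to inverse-CDF sampling from a piecewise linear distribution function. The sketch below assumes the residuals from a robust regression are already available as an array; the function names and the heavy-tailed example data are hypothetical, and no special treatment of the top percentile is included.

```python
import numpy as np


def percentile_table(residuals):
    """Estimate the 0th to 100th percentiles of the residuals."""
    probs = np.linspace(0.0, 1.0, 101)
    return probs, np.quantile(residuals, probs)


def draw_residuals(probs, values, size, rng):
    """Inverse-CDF draws with linear interpolation between adjacent percentiles."""
    u = rng.random(size)                 # uniform draws on [0, 1)
    return np.interp(u, probs, values)   # map each draw to the residual scale


rng = np.random.default_rng(0)
# Hypothetical heavy-tailed residuals standing in for a robust regression's output.
residuals = rng.standard_t(df=3, size=10_000)

probs, values = percentile_table(residuals)
draws = draw_residuals(probs, values, size=1_000, rng=rng)
```

Because the interpolation in the last interval runs linearly from the 99th percentile to the maximum, a single extreme residual spreads probability mass over very large values; this is the situation in which the text suggests deviating from linear interpolation.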

Most microsimulation models have a recursive or block recursive structure for the simple reason that it is not practical and technically feasible to simulate all variables jointly. A usual sequence might for instance start the simulation with demographics, and then continue with schooling, wage rates, labour force participation and hours of work, incomes, wealth and finally taxes. All previously simulated variables may influence later simulated variables but not vice versa. If the system is recursive also in a statistical sense the relations of each block can be estimated independently of the rest of the model. If it is not, the implied stochastic dependence must be accounted for.

The recursive model structure is something the model builder will have to accept for technical reasons and because of shortage of data, even if there are no strong theoretical reasons to think that this structure represents reality. In theory one might shorten the time unit of the model to make the recursive property more realistic (cf. the old discussion about interdependent systems versus recursive systems, Bentzel and Hansen, 1954 and Bentzel, 1997), but in practice data will usually not allow it.

As pointed out by the Panel of Retirement Income Modeling of the U.S. National Research Council (Citro and Hanushek, 1997), one of the major problems in microsimulation work is the shortage of good micro data. Although the supply of micro data has increased a great deal in the last 20-30 years, it is still hardly possible to find one data source or one sample which will contribute all the information needed for a typical microsimulation model. In fact many model builders have found it necessary to use guesstimates of model parameters and then try to calibrate the model against known benchmarks. Calibration is nothing but an attempt to tune the unknown parameters such that the model is able to simulate reasonably well the distributions of key variables. In this respect there is a similarity between microsimulation modeling and general equilibrium modeling. Both rely too often on the calibration technique. Hansen and Heckman (1996) criticized this approach because they found too little emphasis on assessing the quality of the resulting estimates. In fact the properties of the estimates are usually unknown and, mutatis mutandis, the same is true for the simulated entities. The calibration techniques also tend to hide a more serious problem, namely that calibration typically involves only one year’s data or a single average or total. Because of this reliance on a single or just a few points of benchmark data, calibration does not always identify a unique set of model parameter values.

However, even if all unknown parameters are estimated, calibration is still used to make the model “stay on track”. Policy analysts sometimes require simulations to replicate known benchmarks such as the age and gender distribution of the population, known unemployment rates, etc. In SESIM a number of demographic variables, such as the number of deaths, immigrants, emigrants and the number of newborn children, are calibrated to agree with the official population forecasts of Statistics Sweden. Calibration to official statistics is also applied to the number of new graduates at various educational levels. Labour income is calibrated such that its rate of change agrees with the exogenously assumed rate of change. In a few cases the model is also calibrated to the expected outcome of a variable, i.e. the mean of a simulated variable is forced to agree with its expected value. This is done to reduce the Monte Carlo variability of the simulations. This applies for instance to the number of youths who leave their parents’ home, the number of women who start cohabitation, and the inflow to and outflow from disability retirement.7
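Calibrating a binary event to its expected outcome can be sketched as follows. The selection rule used here, ranking individuals by the difference between a uniform draw and their event probability and keeping the expected number of events, is one of several possible schemes and is an assumption of this sketch; the text does not specify how SESIM implements the calibration.

```python
import numpy as np


def aligned_events(probabilities, rng):
    """Select events so that their number equals the rounded expected count."""
    p = np.asarray(probabilities, dtype=float)
    target = int(round(p.sum()))          # expected number of events
    score = rng.random(p.size) - p        # negative score = event in an unaligned draw
    chosen = np.argsort(score)[:target]   # keep the `target` lowest scores
    events = np.zeros(p.size, dtype=bool)
    events[chosen] = True
    return events


# Hypothetical example: 1000 individuals, each with a 10% event probability.
p = np.full(1000, 0.1)
events = aligned_events(p, np.random.default_rng(0))
n_events = int(events.sum())
```

An unaligned Bernoulli draw would give a count that varies around 100 from run to run, whereas here the count is fixed at the expected value and only the identity of those who experience the event remains random.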

If simulated distributions and statistics deviate from observed distributions and statistics more than the random properties of the model allow, this is a clear indication that data reject the particular set of estimates and/or the model structure as such. The model then needs respecification and not alignment. However, if data do not reject the model but it still does not replicate benchmark distributions and statistics well enough, alignment is an approach to increase the precision of the simulations. Alternatively one could adjust the parameter estimates so the model reproduces the benchmarks. One can thus see alignment as estimation subject to the constraints of satisfying the benchmarks. It is, however, not immediately obvious how the parameter estimates should be adjusted. In the case of a linear model and OLS estimation it is a simple case of constrained estimation as demonstrated in Klevmarken (2002), and it is also possible to derive explicit expressions for the alignment factors if one prefers to adjust the simulated values rather than the parameters. This exercise demonstrated that efficient alignment is not equivalent to the proportional adjustment that is commonly used. It also showed that in general all variables will need alignment, not only those that are constrained to satisfy the benchmarks.

It is less obvious how alignment should be done in a nonlinear model. Given that all parameter estimates before alignment are consistent estimates, one approach is to find new estimates that deviate as little as possible from the consistent estimates but make the model satisfy the benchmarks. Write the microsimulation model in the following form,

(1) ${y}_{it}=g\left({y}_{i0},....,{y}_{it-1},{x}_{i0},....,{x}_{it};\theta \right);$

where $y_{it}$ is a vector of k endogenous variables for an individual i at time period t, $x_{it}$ a vector of exogenous variables including any unobserved random variates and θ a vector of p parameters. Let $Y_T$ be the $k \times n$ matrix $\{y_{1T},\ldots,y_{iT},\ldots,y_{nT}\}$ of simulated values for a period T outside the sample period and write the benchmark constraints in the following form,

(2) ${\overline{Y}}_{T}=R\left({Y}_{T}\right);$

where $\overline{Y}_T$ is a matrix of benchmarks with dimensions less than those of $Y_T$, and R is a matrix function. Let the number of implied constraints be c. Typically c<p. For instance, if $\overline{Y}_T$ is a vector of totals, then $R(Y_T)=Y_T J \frac{N}{n};$ where J is an n x 1 vector with unit elements and N the population size.8 Alignment can now be achieved by minimizing the following Lagrange expression with respect to θ,9

(3) $\left(\theta -\stackrel{^}{\theta }{\right)}^{\prime }{V}^{-1}\left(\theta -\stackrel{^}{\theta }\right)+{\lambda }^{\prime }\left({\overline{Y}}_{T}-R\left({Y}_{T}\right)\right);$

where V is the covariance matrix of $\hat{\theta}$. To simplify the exposition let us reformulate the constraint as A(θ)=0, where A is a vector of c nonlinear functions. This is possible because $Y_T$ is a function of θ through the function g of eq. (1) and conditional on a given set of x-vectors. The alignment criterion now becomes,

(4) $\left(\theta -\stackrel{^}{\theta }{\right)}^{\prime }{V}^{-1}\left(\theta -\stackrel{^}{\theta }\right)+{\lambda }^{\prime }A\left(\theta \right);$

The first-order conditions are,

(5) $\theta -\stackrel{^}{\theta }=-\frac{1}{2}V\frac{\mathrm{\partial }{A}^{\prime }}{\mathrm{\partial }\theta }\lambda ;$

We assume that the constraints are unique in the sense that rank$\left(\frac{\partial A}{\partial \theta}\right)=c$. Using a first-order Taylor expansion of the constraints A(θ) around $\hat{\theta}$ and inserting the first-order conditions (5) makes it possible to solve for λ,

(6) $\lambda =2{\left[\frac{\mathrm{\partial }A}{\mathrm{\partial }\theta }V\frac{\mathrm{\partial }{A}^{\prime }}{\mathrm{\partial }\theta }\right]}^{-1}A\left(\stackrel{^}{\theta }\right);$

where $\frac{\partial A}{\partial \theta}$ is evaluated at $\hat{\theta}$. Substituting expression (6) into (5) gives the aligned parameter estimates,

(7) $\stackrel{~}{\theta }=\stackrel{^}{\theta }-V\frac{\mathrm{\partial }{A}^{\prime }}{\mathrm{\partial }\theta }{\left[\frac{\mathrm{\partial }A}{\mathrm{\partial }\theta }V\frac{\mathrm{\partial }{A}^{\prime }}{\mathrm{\partial }\theta }\right]}^{-1}A\left(\stackrel{^}{\theta }\right);$

Given that $\hat{\theta}$ is a consistent estimator, that θ satisfies the constraints and that $\left.\frac{\partial A}{\partial \theta}\right|_{\theta}$ has full rank, it follows that $\tilde{\theta}$ is consistent too. In practice the covariance matrix V is unknown and must be replaced by a consistent estimate. The aligned values of $Y_T$, say $\tilde{Y}_T$, are obtained if the aligned estimates $\tilde{\theta}$ are used in model (1) jointly with the same history of the lagged endogenous variables and the same x:s that gave the non-aligned simulated $Y_T$.10 Due to the nonlinearity of the model and the constraints it is not possible to obtain an explicit expression for the aligned Y, but the same general conclusions as in the linear case still hold.
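As an illustration of eq. (7), the following Python sketch performs one linearised alignment step for the special case of a single constraint (c = 1), so that the bracketed matrix is a scalar. The exponential-mean model, the data `xs`, the benchmark, the estimates `theta_hat` and the covariance matrix `V` are all hypothetical, and the derivative of the constraint is approximated by finite differences.

```python
import math


def constraint(theta, xs, benchmark):
    """A(theta): benchmark mean minus the model mean of exp(t0 + t1 * x)."""
    model_mean = sum(math.exp(theta[0] + theta[1] * x) for x in xs) / len(xs)
    return benchmark - model_mean


def gradient(func, theta, eps=1e-6):
    """Central finite-difference approximation of dA/dtheta."""
    grad = []
    for j in range(len(theta)):
        up, down = list(theta), list(theta)
        up[j] += eps
        down[j] -= eps
        grad.append((func(up) - func(down)) / (2.0 * eps))
    return grad


def align_once(theta_hat, V, func):
    """One alignment step as in eq. (7), restricted to one constraint."""
    p = len(theta_hat)
    a = func(theta_hat)
    g = gradient(func, theta_hat)
    Vg = [sum(V[r][s] * g[s] for s in range(p)) for r in range(p)]
    gVg = sum(g[r] * Vg[r] for r in range(p))  # scalar because c = 1
    return [theta_hat[r] - Vg[r] * a / gVg for r in range(p)]


# Hypothetical inputs.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
theta_hat = [0.1, 0.3]
V = [[0.04, -0.01], [-0.01, 0.02]]
benchmark = 1.6


def A(theta):
    return constraint(theta, xs, benchmark)


theta_tilde = align_once(theta_hat, V, A)
print(A(theta_hat), A(theta_tilde))
```

Because the step relies on a first-order expansion, the aligned estimates satisfy the linearised constraint but only approximately the nonlinear one; the remaining gap indicates how well the linearisation works in a given application.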

Before the model is aligned the constraints should be tested. A test can be obtained from a comparison of the aligned and non-aligned estimators. To simplify the notation reformulate eq. (7) in the following form,

(8) $\stackrel{~}{\theta }-\stackrel{^}{\theta }=BA\left(\stackrel{^}{\theta }\right);$

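where, by eq. (7), $B=-V\frac{\partial A'}{\partial \theta}\left[\frac{\partial A}{\partial \theta}V\frac{\partial A'}{\partial \theta}\right]^{-1}$. If the constraints hold the covariance matrix of this difference becomes,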

(9) $E\left(\stackrel{~}{\theta }-\stackrel{^}{\theta }\right)\left(\stackrel{~}{\theta }-\stackrel{^}{\theta }{\right)}^{\prime }=BE\left[A\left(\stackrel{^}{\theta }\right)A\left(\stackrel{^}{\theta }{\right)}^{\prime }\right]{B}^{\prime }=B\left[\frac{\mathrm{\partial }A}{\mathrm{\partial }\theta }V\frac{\mathrm{\partial }{A}^{\prime }}{\mathrm{\partial }\theta }\right]{B}^{\prime }=V\frac{\mathrm{\partial }{A}^{\prime }}{\mathrm{\partial }\theta }\left[\frac{\mathrm{\partial }A}{\mathrm{\partial }\theta }V\frac{\mathrm{\partial }{A}^{\prime }}{\mathrm{\partial }\theta }{\right]}^{-1}\frac{\mathrm{\partial }A}{\mathrm{\partial }\theta }V;$

A $\chi^2$-test is obtained in the following way,

(10) $\begin{array}{c}{\chi }_{k}^{2}={\left(\stackrel{~}{\theta }-\stackrel{^}{\theta }\right)}^{\prime }{\left[V\frac{\mathrm{\partial }{A}^{\prime }}{\mathrm{\partial }\theta }{\left[\frac{\mathrm{\partial }A}{\mathrm{\partial }\theta }V\frac{\mathrm{\partial }{A}^{\prime }}{\mathrm{\partial }\theta }\right]}^{-1}\frac{\mathrm{\partial }A}{\mathrm{\partial }\theta }V\right]}^{-1}\left(\stackrel{~}{\theta }-\stackrel{^}{\theta }\right)=\\ A{\left(\stackrel{^}{\theta }\right)}^{\prime }{\left[\frac{\mathrm{\partial }A}{\mathrm{\partial }\theta }V\frac{\mathrm{\partial }{A}^{\prime }}{\mathrm{\partial }\theta }\right]}^{-1}\frac{\mathrm{\partial }A}{\mathrm{\partial }\theta }V{\left[V\frac{\mathrm{\partial }{A}^{\prime }}{\mathrm{\partial }\theta }{\left[\frac{\mathrm{\partial }A}{\mathrm{\partial }\theta }V\frac{\mathrm{\partial }{A}^{\prime }}{\mathrm{\partial }\theta }\right]}^{-1}\frac{\mathrm{\partial }A}{\mathrm{\partial }\theta }V\right]}^{-1}×\\ V\frac{\mathrm{\partial }{A}^{\prime }}{\mathrm{\partial }\theta }{\left[\frac{\mathrm{\partial }A}{\mathrm{\partial }\theta }V\frac{\mathrm{\partial }{A}^{\prime }}{\mathrm{\partial }\theta }\right]}^{-1}A\left(\stackrel{^}{\theta }\right);\end{array}$

In practice V and $\frac{\partial A}{\partial \theta}$ will have to be estimated using consistent estimates of θ.
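For a single constraint (c = 1) the test is easy to compute by hand. If the inverse of the rank-deficient covariance matrix in eq. (10) is read as a generalized inverse, the statistic reduces to $A(\hat{\theta})^2 / \left(\frac{\partial A}{\partial \theta}V\frac{\partial A'}{\partial \theta}\right)$. The Python sketch below computes that quantity. The input numbers are hypothetical, and the comparison with 3.841, the 5% critical value of a $\chi^2$ distribution with one degree of freedom, reflects the single-constraint assumption of this sketch.

```python
def alignment_test_statistic(a, g, V):
    """Return A(theta_hat)^2 / (g V g') for one constraint.

    a : value of the constraint at the unaligned estimates
    g : derivative of the constraint with respect to theta (list of length p)
    V : covariance matrix of the unaligned estimates (p x p nested list)
    """
    p = len(g)
    Vg = [sum(V[r][s] * g[s] for s in range(p)) for r in range(p)]
    gVg = sum(g[r] * Vg[r] for r in range(p))
    return a * a / gVg


# Hypothetical values for the constraint, its derivative and V.
a_hat = 0.0744
g_hat = [-1.5256, -1.7522]
V_hat = [[0.04, -0.01], [-0.01, 0.02]]

stat = alignment_test_statistic(a_hat, g_hat, V_hat)
CRITICAL_5_PERCENT_1_DF = 3.841
print(stat, stat > CRITICAL_5_PERCENT_1_DF)
```

A statistic below the critical value means the benchmark is not rejected at the 5% level, which is the situation in which alignment rather than respecification is appropriate.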

If all parameters had been jointly estimated from a common data source, an estimate of V would have been available. In practice subsets of the parameter vector have been estimated from different data sources and thus there is no estimate of the complete covariance matrix. The best one can do is to use the information available and form a block diagonal matrix, say $\hat{V}$. The resulting new aligned estimates will still be consistent if $\hat{V}$ tends in probability to a positive definite matrix, but a $\chi^2$ distribution might not approximate the distribution of the corresponding test statistic well.

The whole approach builds on the first-order Taylor expansion of the constraints A(θ). If the model is highly non-linear a first-order approximation might not be good enough. An alternative approach then is to minimize the first term of eq. (4) with numerical methods subject to the constraints. If there are too many constraints, in particular if the number of constraints exceeds the number of parameters, a set of parameters that satisfies the constraints might not exist. In this case a solution is to reduce the number of constraints and, for instance, only align to marginal distributions. It might also be possible to make separate alignments for selected subsamples.
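One simple numerical alternative, shown here only as a sketch, is to repeat the linearisation at successive points until the nonlinear constraint holds. This sequential linearisation is a technique chosen for the illustration and is not necessarily the method used for SESIM. The binary logit model, the scaled covariate `xs`, the benchmark share, the estimates and the covariance matrix are hypothetical, and there is a single constraint (c = 1).

```python
import math


def probabilities(theta, xs):
    """Logit probabilities 1 / (1 + exp(-(t0 + t1 * x)))."""
    return [1.0 / (1.0 + math.exp(-(theta[0] + theta[1] * x))) for x in xs]


def constraint_and_gradient(theta, xs, benchmark):
    """A(theta) = benchmark share minus mean probability, and dA/dtheta."""
    p = probabilities(theta, xs)
    n = len(xs)
    a = benchmark - sum(p) / n
    g = [-sum(q * (1.0 - q) for q in p) / n,
         -sum(x * q * (1.0 - q) for x, q in zip(xs, p)) / n]
    return a, g


def align_iterated(theta_hat, V, xs, benchmark, tol=1e-10, max_iter=100):
    """Minimise (theta - theta_hat)' V^-1 (theta - theta_hat) s.t. A(theta) = 0."""
    theta = list(theta_hat)
    for _ in range(max_iter):
        a, g = constraint_and_gradient(theta, xs, benchmark)
        Vg = [sum(V[r][s] * g[s] for s in range(2)) for r in range(2)]
        gVg = sum(g[r] * Vg[r] for r in range(2))
        # Constraint linearised at the current point: a + g'(new - theta) = 0.
        rhs = a + sum(g[r] * (theta_hat[r] - theta[r]) for r in range(2))
        new = [theta_hat[r] - Vg[r] * rhs / gVg for r in range(2)]
        if max(abs(new[r] - theta[r]) for r in range(2)) < tol:
            return new
        theta = new
    raise RuntimeError("alignment did not converge")


# Hypothetical inputs.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
theta_hat = [-1.0, 0.8]
V = [[0.05, 0.0], [0.0, 0.05]]
theta_aligned = align_iterated(theta_hat, V, xs, benchmark=0.35)
print(theta_aligned, constraint_and_gradient(theta_aligned, xs, 0.35)[0])
```

The first pass of the loop reproduces the one-step formula of eq. (7); later passes correct for the curvature that the first-order expansion ignores. With several constraints the scalar division would become a matrix inversion, and a general-purpose constrained optimiser may be the more practical choice.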

This approach was used to check the estimates of the imputation model for old age care. The start data set of SESIM does not include any information about old age care and we thus had to impute it. Using an external data source from a study of elderly people aged 75+ and living in a parish of Stockholm, a multinomial logit model was estimated for the probabilities of receiving public assistance in one’s own home and of living in a dwelling/institution with professional help available all day and night (see Chapter 11 in Klevmarken and Lindgren, 2008). Because data came from such a small and rather special area and only applied to those who were at least 75 years old, while we wanted to use the model also for the age group 65-74, it was desirable to check the predictions from the estimated model against national estimates of old age care. Such estimates were obtained by age and gender from the relatively large survey HEK of Statistics Sweden.11

In this case it turned out that the first-order Taylor approximation was not good enough for the multinomial logit model. Instead we minimized the criterion function using numerical methods subject to the side constraint that the model replicate the HEK frequencies.12 In doing so with 36 constraints (six age groups for each gender and, for each combination of age group and gender, three possible outcomes: no help, help at home, and 24 hours surveillance) and 12 parameters we encountered the problem that there was no parameter combination that satisfied the constraints. To avoid this problem the model was aligned for males and females separately but only to the totals of the three groups of care for the age group 65-74. For each gender there were thus only three constraints. Although the model was estimated for males and females jointly, we in this way got separate estimates of the parameters for each gender. The original estimates are compared to the aligned estimates in Table 1 and the originally simulated and the aligned frequencies of care form are compared in Table 2.

Table 1
Table 2

Klevmarken (2002) suggested that the simulated method of moments is a natural and convenient estimation method in the context of microsimulation, because the model is built as a simulator and it usually includes a complex structure nonlinear in parameters as well as in variables, and it is thus difficult to estimate with more traditional methods. Certain kinds of alignment constraints can easily be accounted for in the method of simulated moments. Suppose one of the moment conditions is,

(11) $E\left({y}_{t}-E\left(g\left({x}_{t},{\epsilon }_{t},{\theta }_{0}\right)\right)\right)=0;$

The empirical correspondence to the expression to the left of the equality sign is

(12) $\overline{y}-\frac{1}{n}\sum _{t=1}^{n}\stackrel{~}{k}\left({x}_{t},\theta \right);$

where $\tilde{k}$ is an unbiased simulator of E(y). Suppose now that we know the finite population mean $\overline{Y}$. How could we use this information? If we also knew the $x_t$ values for all individuals in the finite population, we could replace $\overline{y}$ in (12) with $\overline{Y}$ and extend the summation in the second term of (12) to N, and thus get an empirical counterpart to (11) for the whole finite population. In practice this is of course not possible. One only knows the x-observations of the sample, but with known selection probabilities $p_t$ they can be used to compute the following estimate,

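(13) $\overline{Y}-\frac{1}{N}\sum _{t=1}^{n}\frac{1}{{p}_{t}}\stackrel{~}{k}\left({x}_{t},\theta \right);$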

The covariance matrix of the resulting estimate $\hat{\theta}$ should now have a third component, which reflects the sampling from the finite population.

## 6. Model validation

An important part of any model building effort is testing and validation. Validation involves two major issues: first, the choice of criterion and validation measure, and second, the derivation of the stochastic properties of this measure taking all sources of uncertainty into account. The choice of criterion for validation is of course closely related to that for estimation. As already mentioned, we are not only interested in good mean predictions, but also in good representations of cross-sectional distributions and of transitions between states. The timing of events thus becomes important in any dynamic microsimulation exercise. A micro-simulation model is likely to have a number of simplifying assumptions about lack of correlation and independence, both between individuals and over time. For this reason one might expect more random noise in the simulations and faster decaying correlations compared to real data. In addition to model-wide criteria one might thus be interested in criteria that focus on these particular properties. Work is needed to develop such measures with known properties.

For a model not too big and complex in structure it might be feasible to derive an analytic expression for the variance-covariance matrix of the simulations, which takes all sources of uncertainty into account: random sampling, estimation and simulation errors (for example, see Pudney and Sutherland (1996)). In general, micro-simulation models are so complex that analytical solutions are unlikely. Given the parameter estimates, the simulation uncertainty can be evaluated if simulations are replicated with new random number generator seeds for each replication. There is a trade-off between the number of replications needed and the sample size: the bigger the sample, the fewer replications are needed.
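The replication idea can be sketched in a few lines of Python. The “model” here is a hypothetical stand-in, a single event that each unit experiences with a fixed probability, and the sample sizes and number of replications are arbitrary choices for the illustration.

```python
import random
import statistics


def simulate_share(n_units, prob, seed):
    """One simulation run: share of units that experience the event."""
    rng = random.Random(seed)
    return sum(rng.random() < prob for _ in range(n_units)) / n_units


def monte_carlo_sd(n_units, prob, n_reps):
    """Standard deviation of the simulated share across replications.

    Each replication uses a different random number generator seed while
    the parameter (prob) is held fixed.
    """
    shares = [simulate_share(n_units, prob, seed) for seed in range(n_reps)]
    return statistics.stdev(shares)


sd_small_sample = monte_carlo_sd(n_units=100, prob=0.3, n_reps=100)
sd_large_sample = monte_carlo_sd(n_units=2500, prob=0.3, n_reps=100)
print(sd_small_sample, sd_large_sample)
```

The spread across replications shrinks as the simulated sample grows, which is the trade-off noted above: a large base sample needs fewer replications to reach a given precision.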

To evaluate the uncertainty which arises through the parameter estimates one approach is to approximate the distribution of the estimates with a multivariate normal distribution with mean vector and covariance matrix equal to that of the estimated parameters. By repeated draws from this normal distribution and new model simulations for each draw of parameter values an estimate of the variability in the simulation due to uncertainty about the true parameter values can be obtained.
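A minimal Python sketch of this procedure follows. The logit “model”, the estimates `theta_hat`, the covariance matrix `V` and the covariate values are hypothetical, and the model is evaluated without simulation noise so that only parameter uncertainty shows up in the result. Multivariate normal draws are generated from a Cholesky factor of `V`.

```python
import math
import random
import statistics


def cholesky(V):
    """Lower-triangular L with L L' = V for a positive definite matrix V."""
    n = len(V)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(V[i][i] - s)
            else:
                L[i][j] = (V[i][j] - s) / L[j][j]
    return L


def draw_theta(theta_hat, L, rng):
    """One draw from N(theta_hat, L L')."""
    z = [rng.gauss(0.0, 1.0) for _ in theta_hat]
    return [t + sum(L[i][k] * z[k] for k in range(i + 1))
            for i, t in enumerate(theta_hat)]


def model_output(theta, xs):
    """Hypothetical model output: mean logit probability over the sample."""
    return sum(1.0 / (1.0 + math.exp(-(theta[0] + theta[1] * x)))
               for x in xs) / len(xs)


def parameter_uncertainty(theta_hat, V, xs, n_draws=500, seed=1):
    """Mean and standard deviation of the output across parameter draws."""
    rng = random.Random(seed)
    L = cholesky(V)
    outputs = [model_output(draw_theta(theta_hat, L, rng), xs)
               for _ in range(n_draws)]
    return statistics.mean(outputs), statistics.stdev(outputs)


xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
print(parameter_uncertainty([-1.0, 0.8], [[0.05, 0.0], [0.0, 0.05]], xs))
```

In a full model each parameter draw would be followed by a complete simulation run, so the resulting spread would combine parameter and simulation uncertainty unless the random number seeds are held fixed across draws.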

To avoid the normal approximation one might use sample re-use methods. For instance, by bootstrapping one can obtain a set of replicated estimates of the model parameters. Each replication can be used in one or more simulation runs, and the variance of these simulations will capture both the variability in parameter estimates and the variability due to simulation (model) errors. If the bootstrap samples are used not only to estimate the parameters but also as replicated bases (initial conditions) for the simulations, then one would also be able to capture the random sampling errors. In practice this advice will become difficult to follow for large models. To re-estimate all relations using many data sets is likely to become burdensome. Depending on model structure this approach could, however, be applied to sub-models or blocks of sub-models.
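The bootstrap variant can be sketched in the same spirit. Everything in the example is hypothetical: the estimation sample is a list of 0/1 outcomes, the “estimator” is the sample proportion, and the “model” simulates the same event for a fixed number of units.

```python
import random
import statistics


def estimate(sample):
    """Hypothetical estimator: the sample proportion of events."""
    return sum(sample) / len(sample)


def simulate(prob, n_units, rng):
    """Hypothetical model run: simulated share of events among n_units."""
    return sum(rng.random() < prob for _ in range(n_units)) / n_units


def bootstrap_simulation(sample, n_boot=200, n_units=500, seed=2):
    """Re-estimate on each bootstrap sample and simulate once per estimate."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_boot):
        resample = rng.choices(sample, k=len(sample))  # draw with replacement
        results.append(simulate(estimate(resample), n_units, rng))
    return statistics.mean(results), statistics.stdev(results)


# Hypothetical estimation sample: 30 events among 100 observations.
sample = [1] * 30 + [0] * 70
print(bootstrap_simulation(sample))
```

The spread of the results reflects both the re-estimation and the fresh random numbers of each run. Using each bootstrap sample also as the base population of the simulation, as suggested above, would add the random sampling error; this sketch does not do that.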

Much of the total error in simulated values will come from the choice of a particular model structure. Sensitivity analysis is an approach to assess the importance of this source of error. As pointed out in Citro and Hanushek (1997) (p. 155) “sensitivity analysis is a diagnostic tool for ascertaining which parts of an overall model could have the largest impact on results and therefore are the most important to scrutinize for potential errors that could be reduced or eliminated”. If simple measures of the impact on key variables from marginal changes in parameters and exogenous entities could be computed they would potentially become very useful.
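One simple measure of this kind is the elasticity of a key output with respect to each parameter, approximated by a small relative change. The Python sketch below is an illustration under stated assumptions: the toy cost function, its parameter names and values, and the 1% step are hypothetical and are not taken from SESIM.

```python
def pension_cost(params, years=10):
    """Hypothetical key output: total cost after `years` of indexation."""
    return (params["retirees"] * params["benefit"]
            * (1.0 + params["indexation"]) ** years)


def elasticities(func, params, rel_step=0.01):
    """Approximate elasticity of func with respect to each parameter.

    Each parameter is increased by rel_step (1% by default) in turn while
    the others are held at their baseline values.
    """
    base = func(params)
    result = {}
    for name, value in params.items():
        bumped = dict(params)
        bumped[name] = value * (1.0 + rel_step)
        result[name] = ((func(bumped) - base) / base) / rel_step
    return result


baseline = {"retirees": 1000.0, "benefit": 12000.0, "indexation": 0.02}
for name, value in elasticities(pension_cost, baseline).items():
    print(name, round(value, 3))
```

Parameters with the largest elasticities are the natural candidates for closer scrutiny. In a stochastic model the same random numbers should be used in the baseline and in the perturbed run so that simulation noise does not swamp the effect of the marginal change.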

## 7. End remark

The size and complexity of a typical microsimulation model make it hard to understand its properties intuitively. This is one reason why micro simulation has received a rather cool reception from the Economics profession. Given the main tradition of working with small, stylized models and the relative failure of the large macro models of the 1960s and 1970s many economists are now skeptical about the usefulness of large models. In order to change this, micro simulation modeling has to rely on good economic theory and use sound econometric inference methods, but economists also have to learn what scientists in other disciplines already know, namely how to examine the properties of large simulation models.

Contributing to the skepticism of the Economics profession is also the view that the science of Economics has not yet given us knowledge such that it is meaningful to build large micro simulation models for policy analysis and policy advice. For instance, in their assessment of the needs for data, research and models the Panel of Retirement Income Modeling of the U.S. National Research Council concluded (Citro and Hanushek, 1997, p. 163):

“To respond to immediate policy needs, agencies should use limited, special-purpose models with the best available data and research findings to answer specific policy questions. Although such models may not provide very accurate estimates, the alternative of developing complex new individual-level microsimulation or employer models in advance of needed improvements in data and research knowledge has little prospect of producing better results and will likely represent, in the immediate future, a misuse of scarce resources.”

This was a recommendation to government agencies as policy makers concerned with retirement behavior. It should not be interpreted as a general recommendation against microsimulation. On the contrary they also suggested (p. 153):

“The relevant federal agencies should consider the development of a new integrated individual-level microsimulation model for retirement-income-related policy analysis as an important long-term goal, but construction of such a model would be premature until advances are made in data, research knowledge, and computational methods.”

This evaluation is almost ten years old. Do we take a different stand today? The supply of good data has certainly increased, and the computing capacity has increased as well. Simulation is much faster today and simulation-based estimation is feasible. What about economic theory and our understanding of how society works?

## Footnotes

### 2.

However, there is sometimes a problem when the sampling frame does not apply to the current year target population. If the sample used was drawn a few years ago it is not obvious how one can use it for an inference to the current population.

### 3.

Some authors have extended the applicability of a micro simulation model even further. Caldwell and Morrison (2000) ch.9, p.216 suggest that “Just as simulation in astronomy are used to ‘observe’ processes which are difficult or impossible to observe with other methods ….., so microsimulation can be used to ‘observe’ processes and outcomes for which no respectable data exist from other sources.” This idea appears analogous to that of estimating the cells of a contingency table just knowing the marginal distributions and using minimal assumptions about the joint distribution.

### 4.

The simulation context puts constraints on the choice of model. For instance, in studying the evolution of variables such as fertility, income and wealth, models which assume specific period or birth cohort effects are sometimes used. Circumstances which are associated with a particular year or a particular birth cohort are assumed to influence the variable of interest. Models of this kind are not suitable for microsimulation because one would have to extrapolate exogenously the estimated period and birth cohort effects into the future.

### 5.

Although rich in coverage and usually of good quality Swedish register data lack complete information on who lives in the same household. Unmarried cohabiting couples without common children are considered singles and adult children living with their parents are also considered independent single households. To get around this problem and obtain a useful household concept data from another nonregister based survey of Statistics Sweden were used to pair some of the singles in the 1999 wave of LINDA into new households, see Chapter 3 in Klevmarken and Lindgren (2008).

### 6.

A simple but useful solution to an analogous problem was recently suggested in Creedy and Kalb (2005).

### 7.

In SESIM the calibration is achieved using one of three different approaches. Let $\hat{\pi}_i$ be the estimated probability of an event, and let $u_i$ be a random draw from a standard uniform distribution. Assume that the calibration benchmark is T, i.e. the simulated process must result in T cases. Then in “uniform calibration” the T cases with the smallest differences $u_i-\hat{\pi}_i$ are selected, while in “logistic calibration” the T cases with the smallest differences $\mathrm{logit}(u_i)-\mathrm{logit}(\hat{\pi}_i)$ are selected. A third approach is just to rescale $\hat{\pi}_i$ to give the desired expected number of cases. In the latter case there is no reduction in simulation variability.

### 8.

It is assumed that the sample was obtained with simple random sampling.

### 9.

The quadratic distance function is a natural choice, but alternatives are possible. In a different context, calibration estimators in survey sampling, Deville and Särndal (1992) gave examples of several useful distance functions.

### 10.

Note that this implies that the same realisations of all random variables have to be used. This follows from the assumption that it is the simulated values using $\hat{\theta}$, a single realisation of the model, that are aligned. Although this is what is most often done in practice, an alternative approach is to constrain the expected value of the simulated realisations to satisfy the constraint.

### 11.

We couldn’t use the HEK survey to estimate the model because HEK does not have ADL measures, which are good predictors of old age care.

### 12.

The GAUSS program CO was used.

## References

1. Ekonomporträttet: Herman Wold 1908-1992 (1997). Ekonomisk Debatt, Årg 25:473–479.
2. On Recursiveness and Interdependency in Economic Models (1954). The Review of Economic Studies 22:153. https://doi.org/10.2307/2295873
3. Content, validation and uses of CORSIM 2.0, a dynamic microanalytic model of the United States (1993). IARIW conference on Micro-simulation and Public Policy.
4. Health, Wealth, Pensions and Life Paths: The CORSIM dynamic Microsimulation Model (1996). In: Microsimulation and Public Policy. North-Holland.
5. Validation of longitudinal dynamic microsimulation models: Experience with CORSIM and DYNACAN (2000). In: Microsimulation in the New Millennium: Challenges and Innovations, Chapter 9. Cambridge, U.K.: Cambridge University Press.
6. Simulating Welfare and Income Tax Changes: The ESRI Tax-Benefit Model (1996). Dublin: ESRI.
7. Assessing the direct and indirect effects of social policy: integrating input-output and tax microsimulation models at Statistics Canada (2000). In: Microsimulation Modelling for Policy Analysis. Challenges and Innovations. Cambridge, UK: Cambridge University Press.
8. Assessing Policies for Retirement Income. Needs for Data, Research, and Models (editors) (1997). Washington, D.C.: National Research Council, National Academy Press.
9. Evaluating Policy Reforms in Behavioural Tax Microsimulation Models (2005). 34th meeting with the Economic Society of Australia.
10. Calibration Estimators in Survey Sampling (1992). Journal of the American Statistical Association 87:376–382. https://doi.org/10.1080/01621459.1992.10475217
11. Behavioral tax microsimulation with finite hours choices (1997). European Economic Review 41:619–626. https://doi.org/10.1016/S0014-2921(97)00005-6
12. Estimation of a Dynamic Ordered Probit Model with Time Gaps Within Observations (2004). Mimeo, Department of Economics, Uppsala University, Sweden.
13. Endogenous Economic Growth through Selection (1996). In: Microsimulation and Public Policy. Amsterdam: Elsevier Science Publishers.
14. En Prognosmodell För Den Allmänna Tilläggspensioneringen (1973). Stockholm: Riksförsäkringsverket.
15. SESIM – A Short Documentation (1999). Stockholm: Ministry of Finance.
16. SESIM III – a Swedish dynamic micro simulation model (2005). www.sesim.org
17. Policy evaluation by microsimulation - the Frankfurt model (1989). 21st General Conference of the International Association for Research in Income and Wealth.
18. Mikrosimulationsmodelle in der Forschungsstrategie des Sonderforschungsbereich 3 (1994). In: Mikroanalytischer Grundlagen Der Gesellschaftspolitik, Vol. 2. Berlin: Akademie Verlag, pp. 369–379.
19. The microsimulation model of the Sfb 3 for the analysis of economic and social policies (1986). In: Microanalytic Simulation Models to Support Social and Financial Policy. Amsterdam: North-Holland, pp. 227–247.
20. Microsimulation in Government Policy and Forecasting, Contributions to Economic Analysis 247 (2000). Elsevier.
21. The Probability Approach in Econometrics (1944). Econometrica 12:iii. https://doi.org/10.2307/1906935
22. Longitudinal microsimulation of life income (1986). In: Microanalytic Simulation Models to Support Social and Financial Policy. Amsterdam: North-Holland.
23. Charging for care in later life: an exercise in dynamic microsimulation (2000). In: Microsimulation Modelling for Policy Analysis. Cambridge: Cambridge University Press.
24. The Empirical foundations of Calibration (1996). Journal of Economic Perspectives 10:87–104. https://doi.org/10.1257/jep.10.1.87
25. Microsimulation and Public Policy (1996). Amsterdam: North-Holland, Elsevier Science B.V.
26. Auswirkungen Öffentlicher Bildungsausgaben in Der BRD Auf Die Einkommensverteilung Der Ausbildungsgeneration (1982). Stuttgart: Gutachten im Auftrag der Transfer-Enquete-Kommission, Kohlhammer.
27. Simulating an Ageing Population. A Microsimulation Approach Applied to Sweden, Contributions to Economic Analysis No 285 (2008). Bingley, U.K.: Emerald Group Publishing Lmtd.
28. En ny modell för ATP-systemet (1973). Statistisk Tidskrift (Statistical Review) 1973:403–443.
29. Behavioral Modeling in Micro Simulation Models. A Survey (1997). Department of Economics, Uppsala University, Sweden.
30. Microsimulation – A Tool for Economic Analysis (2001). Department of Economics, Uppsala University, Sweden.
31. Statistical inference in Micro Simulation Models: Incorporating External Information (2002). Mathematics and Computers in Simulation 59:255–265. https://doi.org/10.1016/S0378-4754(01)00413-X
32. Direct and behavioral effects of income tax changes - simulations with the Swedish model MICROHUS (1996). In: Microsimulation and Public Policy. Amsterdam: Elsevier Science Publishers.
33. An Introduction to STINMOD: A Static Microsimulation Model (1994). University of Canberra, Australia.
34. Microsimulation Modelling for Policy Analysis (2000). Cambridge: Cambridge University Press.
35. DYNACAN, The Canadian Pension Plan Policy Model: Demographic and Earnings Components (1997). Proceedings of the Microsimulation Section at the International Conference on Information Theory, Combinatorics, and Statistics.
36. Towards a Payable Pension System. Costs and Redistributive Impact of the Current Dutch Pension System and Three Alternatives (1994). TISSER, Tilburg Institute for Social Security Research, Department of Social Security Studies, The Netherlands.
37. A new type of socio-economic system (1957). The Review of Economics and Statistics 39:116. https://doi.org/10.2307/1928528
38. Policy Explorations Through Microanalytic Simulation (1976). Washington, D.C.: The Urban Institute.
39. Microanalysis of Socioeconomic Systems: A Simulation Study (1961). New York: Harper and Row.
40. Microanalytic Simulation Models to Support Social and Financial Policy (editors) (1986). Amsterdam: North-Holland, Elsevier Science Publishers B.V.
41. Microsimulation and Public Policy (1996). Amsterdam: North-Holland Elsevier.
42. Can we do better comparative research using microsimulation models? Lessons from the micro analysis of pension systems (2000). In: Microsimulation in the New Millennium, Changes and Innovations. Cambridge, UK: Cambridge University Press.
43. POLIMOD: An outline (1996). Microsimulation Unit MU/RN/19, 2nd edition. Cambridge: DAE, University of Cambridge.
44. A Comparison of Data Merging Methodologies for Extending A Microsimulation Model (1996). University of Canberra, Australia.
45. EUROMOD: A European Benefit-tax Model (1996). Microsimulation Unit MU/RN/20. Cambridge: DAE, University of Cambridge.
46. EUROMOD: An Integrated European Benefit-tax Model Final Report (2001). EUROMOD Working Paper No EM9/01. Cambridge: DAE, University of Cambridge.

## Article and author information

### Author details

1. #### Anders Klevmarken

Department of Economics, Uppsala, Sweden
##### For correspondence
anders@klevmarken.nu
##### Competing interests
No competing interests reported

### Acknowledgements

This article has been previously published as Chapter 2 in A. Klevmarken and B. Lindgren Simulating an Ageing Population. A microsimulation approach applied to Sweden, Contributions to economic analysis no 285, Emerald Group Publishing Lmtd, Bingley, U.K. 2008.

### Publication history

1. Version of Record published: April 30, 2022 (version 1)