# Spatial microsimulation: Developments and potential future directions

1. Institute for Governance and Policy Analysis, University of Canberra, Australia
Research article
Cite this article as: R. Tanton; 2018; Spatial microsimulation: Developments and potential future directions; International Journal of Microsimulation; 11(1); 143-161. doi: 10.34196/ijm.00176

## Abstract

This paper summarises some of the latest developments in methods to estimate and validate spatial microsimulation models. The paper also attempts to identify where the potential is for new areas of development in spatial microsimulation models, based on the author’s reading of the spatial microsimulation landscape in 2018. The methods outlined in this paper are identified as significant developments in the field, and include a number of new methods for calculating or adding indicators to a spatial microsimulation model; as well as new methods of validation and estimating confidence intervals. Potential new areas of research include further development of methods for calculating confidence intervals; work on getting spatial microsimulation into the mainstream of policy analysis; work on linking models to provide input into managing complex problems in society; and work on using big data in spatial microsimulation models.

## 1. Introduction

Spatial microsimulation adds a spatial element to traditional microsimulation models, as the spatial element is important for policy analysis. In particular, indicators like housing stress and incomes are highly spatially clustered (Tanton, Vidyattama, & Mohanty, 2015). Using spatial microsimulation to derive estimates of these indicators for small areas provides a more targeted analysis compared to using national indicators. It is also important to view results at a spatial scale, as the public, policy makers and politicians are interested in what is happening in their electorate, or in their suburb.

It could be argued that the idea of spatial microsimulation was developed around the same time that Orcutt developed microsimulation (Orcutt, 1957), as Hägerstrand (1957) developed the first geographical application of microsimulation to study internal migration in central Sweden. This was further developed by Hägerstrand (1967), which used microsimulation techniques to study the spatial diffusion of innovation. Nearly a decade later, Wilson and Pownall (1976) developed a spatial modelling framework to represent the urban system based on the micro level interdependence of households and individuals.

Much of the early development of spatial microsimulation was at the Department of Geography, University of Leeds (see, for example, Clarke & Wilson, 1985). There was then significant activity in the field in the 1990s (Ballas, Clarke, & Turton, 1999; Clarke, 1996; Clarke, Kashti, McDonald, & Williamson, 1997; Williamson, Birkin, & Rees, 1998). Reviews of methods on the development of spatial microsimulation are provided in a number of published papers (Hermes & Poulsen, 2012; O’Donoghue, Morrissey, & Lennon, 2014; Tanton, 2014; Tanton & Clarke, 2014). This paper aims to update these earlier summary papers with some of the latest developments in spatial microsimulation over the last few years, and also looks at where spatial microsimulation may head in the next ten years. This second aim is very much my view, based on a broader reading of the landscape.

In terms of methodological developments, this paper considers two aspects: the method for developing a synthetic population for a microsimulation model, and recent developments in validation, including calculating confidence intervals. These are important in determining the reliability of any model.

In terms of where spatial microsimulation may go in the future, I identify four areas:

• Estimation of variability: this is a very new area, and more work needs to be done in deriving an approach that is stable and can be used across all models, like the replicate methods or Monte Carlo methods used to derive confidence intervals in national microsimulation models.

• Linked models: there has been much recent work on linking computable general equilibrium (CGE) and microsimulation models, and some work on linking CGE to spatial microsimulation models, but the next step is to bring together a range of models from different disciplines to provide input into different policy options for resolving complex problems.

• Making results from spatial microsimulation models more palatable to Government policy makers and the public: spatial microsimulation has not had the same take up in Government as tax/transfer microsimulation. This is changing; for example, Statistics Canada now uses a spatial microsimulation model for demographic forecasting (Statistics Canada, 2010a, 2011a). There is still work to be done, however, in convincing policy makers and politicians that spatial microsimulation is a useful policy tool.

• Use of big data: this includes data from smart electricity meters, smart card transport data, and data from Internet of Things (IoT) devices. Smart cities use data from transport networks and from online devices that, for example, monitor the environment or identify free parking spaces. These large amounts of data are unit level, and normally spatial, so are perfect for use in a spatial microsimulation model. Smart energy meters are already being used in multilevel modelling (Anderson, Lin, Newing, Bahaj, & James, 2017), so extending this to combine big data for predictive and “what if” modelling with a spatial microsimulation model would be a logical next step.

The rest of this paper is structured with Section 2 on development of methods; Section 3 on potential future directions of spatial microsimulation; and conclusions (Section 4).

## 2. Development of methods

Since 2014, when the last summary papers were written (O’Donoghue, Morrissey, & Lennon, 2014; Tanton, 2014; Tanton & Clarke, 2014), some new methods for conducting spatial microsimulation have appeared in the literature. These include a penalised maximum entropy approach (Rose & Nagle, 2017); a multilevel approach (Fenton, 2016); an approach called Fitness Based Synthesis (FBS) developed by Ma and Srinivasan (2015a); and an approach that is a rediscovery and development of previous work (Lymer, Brown, Harding, & Yap, 2009), using a standard technique to develop the synthetic population but then imputing new variables (Namazi-Rad, Tanton, Steel, Mokhtarian, & Das, 2017; Philips, Clarke, & Watling, 2017a).

New methods of validation have also come up in the literature since 2014, with one applying the Bland-Altman test used in the health field (Timmins & Edwards, 2016a) and others looking at how to model confidence intervals and measures of variability in a spatial microsimulation model (Rahman, 2017; Tanton, 2015; Whitworth, Carter, Ballas, & Moon, 2017).

### 2.1 New methods of calculating a synthetic population

One of the methods in spatial microsimulation is Iterative Proportional Fitting (IPF), where initial weights are calculated based on one benchmark table (the marginal totals to match), and then adjusted based on a second benchmark table, a third table, and so on until the last table is reached, when the procedure returns to the first table and adjusts to these benchmarks again. The process iterates until the weighted totals match all of the benchmark tables to some level of accuracy. The problem with IPF is that a measure of the quality of the final estimates is not available; and the procedure can struggle with sparse populations, that is, areas where there are not many people in a particular cell in the benchmark table.
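The IPF loop described above can be sketched for a single small area as follows. This is a minimal illustration rather than any published implementation: the function name `ipf`, the record layout, the tolerance and the toy benchmark totals are all assumptions made for the example.

```python
def ipf(records, benchmarks, max_iter=100, tol=1e-6):
    """Reweight survey records so weighted totals match each benchmark table.

    records    : list of dicts mapping variable -> category (hypothetical layout)
    benchmarks : dict mapping variable -> {category: target total for the area}
    """
    weights = [1.0] * len(records)
    for _ in range(max_iter):
        max_gap = 0.0
        for var, targets in benchmarks.items():
            # Weighted totals for this table before adjusting to it
            totals = {cat: 0.0 for cat in targets}
            for rec, w in zip(records, weights):
                totals[rec[var]] += w
            max_gap = max(max_gap, max(abs(totals[c] - targets[c]) for c in targets))
            # Scale each weight by target / current total for its category
            for i, rec in enumerate(records):
                cat = rec[var]
                if totals[cat] > 0:
                    weights[i] *= targets[cat] / totals[cat]
        # Stop once every table was already matched at the start of its pass
        if max_gap < tol:
            break
    return weights


# Toy survey of five people and two invented benchmark tables for one area
records = [
    {"age": "young", "sex": "male"},
    {"age": "young", "sex": "male"},
    {"age": "young", "sex": "female"},
    {"age": "old", "sex": "male"},
    {"age": "old", "sex": "female"},
]
benchmarks = {
    "age": {"young": 60, "old": 40},
    "sex": {"male": 50, "female": 50},
}
weights = ipf(records, benchmarks)
```

In a full model the same loop would be repeated for every small area, each with its own benchmark tables.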

Rose & Nagle (2017) describe a penalised maximum entropy (P-MEDM) approach to developing a synthetic population. This approach is similar to IPF, but instead of exactly summing the weights to a known marginal total, an element of error is incorporated into the calculation. This reduces overfitting problems with IPF when using sparse populations, and by introducing a measure of uncertainty, a measure of quality can be produced. The P-MEDM approach also requires knowledge of the variance in the benchmark tables to estimate this error. Rose & Nagle (2017) fit the P-MEDM model to estimate infant mortality for Bangladesh, using the Bangladesh Demographic and Health Survey (BDHS) and Census data for the benchmark tables. The final benchmarks chosen are school attendance, literacy, employment, source of drinking water, electricity connection, housing tenancy, average size of household, rural/urban and administrative division.

Internal validation results from the P-MEDM model are encouraging; however, results from the validation to an external survey are mixed. The authors consider that some of this may be due to the external survey being very different to the survey used in the spatial microsimulation method, and raise concerns about the accuracy of both surveys. Their overall conclusion is that it is possible to produce accurate small area estimates using the P-MEDM approach, under the important provision that the analyst understands the limitations of the data and sampling method used in the survey. The method is likely to fail when producing any estimate of a small or rare population that is not adequately sampled in the survey; so like IPF, it still struggles with sparse populations. Based on their results and intuition, it can be concluded that when the method fails, it can fail dramatically.

Another method that has come out in recent years is multilevel IPF. The IPF routine normally fits to person or household level benchmarks, but not to both at the same time. One way to overcome this limitation is to use a multilevel IPF (Fenton, 2016), which incorporates both person and household benchmarks using a multilevel statistical model. Fenton compares an IPF using a household only (single level) model; a multilevel model; and then a third refinement to IPF which uses a survey of personal incomes (SPI) for local areas to adjust the start weight for the IPF so that it reflects the relative probability of an adult with that income being selected from the distribution of incomes in the local area.

The multilevel method uses adult and household benchmark data. For each IPF iteration, the adult benchmark totals are first applied. The arithmetic mean of these weights is then used as the starting weight to fit the household-level benchmarks. These weights are then applied to all adult household members, and the adult benchmarks are re-applied. This continues for the desired number of iterations, finally fitting and producing a set of household weights.
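One possible reading of these steps is sketched below. It is not Fenton's code: the helper `rake`, the data layout and the toy totals are assumptions, and each level is fitted to a single benchmark table to keep the example short.

```python
def rake(weights, categories, targets):
    """Scale weights so the weighted total in each category matches its target."""
    totals = {c: 0.0 for c in targets}
    for w, c in zip(weights, categories):
        totals[c] += w
    return [w * targets[c] / totals[c] if totals[c] > 0 else w
            for w, c in zip(weights, categories)]


def multilevel_ipf(adult_household, adult_category, household_category,
                   adult_targets, household_targets, iterations=50):
    household_ids = sorted(household_category)
    adult_weights = [1.0] * len(adult_household)
    household_weights = {h: 1.0 for h in household_ids}
    for _ in range(iterations):
        # 1. Apply the adult benchmark totals
        adult_weights = rake(adult_weights, adult_category, adult_targets)
        # 2. The mean adult weight in each household is the household start weight
        sums = {h: 0.0 for h in household_ids}
        counts = {h: 0 for h in household_ids}
        for h, w in zip(adult_household, adult_weights):
            sums[h] += w
            counts[h] += 1
        start = [sums[h] / counts[h] for h in household_ids]
        # 3. Fit the household-level benchmarks
        fitted = rake(start, [household_category[h] for h in household_ids],
                      household_targets)
        household_weights = dict(zip(household_ids, fitted))
        # 4. Pass the household weight back to every adult in the household
        adult_weights = [household_weights[h] for h in adult_household]
    return household_weights


# Invented example: five adults in three households
adult_household = ["h1", "h1", "h2", "h3", "h3"]
adult_category = ["employed", "not_employed", "employed", "employed", "employed"]
household_category = {"h1": "renter", "h2": "owner", "h3": "owner"}
household_weights = multilevel_ipf(
    adult_household, adult_category, household_category,
    adult_targets={"employed": 70, "not_employed": 30},
    household_targets={"renter": 40, "owner": 60},
)
```

Because the loop ends on the household fit, the household benchmarks are matched exactly, while the adult benchmarks are only matched approximately.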

The method that incorporates the SPI uses the SPI weights to adjust the start weights for the IPF, so that the start weights are based on the SPI survey weights, reflecting the relative probability of that household being selected in the sample.

Fenton then uses the three methods to derive a number of indicators of multiple deprivation, including poverty rates. The multilevel model gives better estimates of incomes at the lower end of the income scale compared to the standard IPF method, whereas incorporating the SPI data to provide better start weights results in more accurate incomes at the top end of the income distribution. However, this also means that these top incomes are subject to considerable uncertainty, so care is needed in using this method.

The final method identified in the literature is the FBS approach. This approach can either start with a random sample of people (with replacement) from a survey who are assigned to specific areas using some basic benchmarks like age, sex, etc.; or can start with a null population in each area, using the FBS approach to add households. The approach calculates two fitness values for each household in the seed data (the survey being used to create the synthetic population), one for if adding the household into the synthetic population would reduce the error; and one for if removing it would reduce the error. Once the fitness values have been calculated, one household is randomly selected from those that have positive FBS values and this household is added or removed from the synthetic population. The fitness values are then recalculated; and the process continues. The process iterates until no households have positive values for either of the two fitness values (Ma & Srinivasan, 2015b). This process has also been implemented to create a synthetic population from a weighted population (Esteban Munoz, 2016), calculated using the GREGWT procedure, a reweighting program which uses a generalised regression model (Tanton, Vidyattama, Nepal, & McNamara, 2011).
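A minimal sketch of this add-or-remove loop, starting from a null population, is given below. The fitness value is taken here to be the reduction in the sum of squared differences between benchmark and synthetic counts, which is an assumption made for illustration; the function names and toy households are also hypothetical.

```python
import random


def squared_error(counts, targets):
    return sum((targets[k] - counts[k]) ** 2 for k in targets)


def shifted(counts, household, sign):
    """Counts after adding (sign=+1) or removing (sign=-1) one household."""
    return {k: counts[k] + sign * household.get(k, 0) for k in counts}


def fitness_based_synthesis(seed, targets, rng, max_steps=10_000):
    population = []                      # indices into seed; repeats allowed
    counts = {k: 0 for k in targets}     # null population to start
    for _ in range(max_steps):
        current = squared_error(counts, targets)
        moves = []
        for idx, household in enumerate(seed):
            # Fitness of adding: positive if the error would fall
            if current - squared_error(shifted(counts, household, +1), targets) > 0:
                moves.append((+1, idx))
            # Fitness of removing: only for households already selected
            if idx in population and \
                    current - squared_error(shifted(counts, household, -1), targets) > 0:
                moves.append((-1, idx))
        if not moves:                    # no positive fitness values remain
            break
        sign, idx = rng.choice(moves)    # pick one improving move at random
        if sign > 0:
            population.append(idx)
        else:
            population.remove(idx)
        counts = shifted(counts, seed[idx], sign)
    return population, counts


# Invented seed households and benchmark totals for one area
seed = [
    {"adults": 2, "children": 0},
    {"adults": 1, "children": 1},
    {"adults": 2, "children": 2},
]
targets = {"adults": 10, "children": 4}
population, counts = fitness_based_synthesis(seed, targets, random.Random(1))
```

Every accepted move lowers an integer-valued error, so the loop must stop, although it may stop at a local minimum rather than at an exact match.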

Comparing results from this process applied to areas in Florida with results from an IPF method, the authors find that the FBS approach provides more accurate populations for more areas compared to the IPF approach and can derive more accurate results using fewer control tables. Further, increasing the number of control tables has little impact on the number of iterations required by the FBS approach.

Finally, imputation methods are increasingly used in microsimulation models like tax/transfer models and health models (Schofield et al., 2017). In spatial microsimulation models, they have been used by Lymer et al. (2009), and more recently by Namazi-Rad et al. (2017) and Philips, Clarke, & Watling (2017b). Imputation methods create a synthetic population, and then add data to this population from another dataset (usually a survey) using either a Monte Carlo approach (Philips et al., 2017b), a regression approach (Lymer et al., 2009) or a matching approach (Namazi-Rad et al., 2017). The Monte Carlo approach starts by using simulated annealing to develop a synthetic population, and then performs Monte Carlo sampling using conditional probabilities to add missing attributes. The regression approach uses a new dataset to calculate probabilities for a specific characteristic and then applies these probabilities to people in the synthetic population to calculate a probability that the person/household has that characteristic. The matching approach brings data from another survey onto the synthetic population by finding similar people in another dataset and matching them to people in the synthetic population.
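The Monte Carlo step can be illustrated with a short sketch that draws a missing attribute for each synthetic person from conditional probabilities. The attribute names and probabilities are invented, and the synthetic population is simply assumed to exist already (the simulated annealing stage is not shown).

```python
import random


def impute(population, conditional_probs, key, new_attr, rng):
    """Add new_attr to each person by sampling from P(new_attr | person[key])."""
    for person in population:
        probs = conditional_probs[person[key]]
        categories = list(probs)
        person[new_attr] = rng.choices(
            categories, weights=[probs[c] for c in categories], k=1
        )[0]
    return population


# Hypothetical synthetic population and survey-derived conditional probabilities
population = [{"age": "young"}, {"age": "young"}, {"age": "old"}, {"age": "old"}]
car_given_age = {
    "young": {"car": 0.3, "no_car": 0.7},
    "old": {"car": 0.8, "no_car": 0.2},
}
population = impute(population, car_given_age, "age", "car_access", random.Random(42))
```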

### 2.2 New methods of validation

In terms of validation, research has gone into two directions: developing new methods of validation; and calculating confidence intervals.

One new method of validation is the application of the Bland-Altman (BA) test, popular in health, to spatial microsimulation. This is a classic example where cross fertilisation of methods across disciplines has benefited spatial microsimulation. Because spatial microsimulation is a method that can be applied to a number of disciplines, there is a high degree of sharing of approaches across disciplines, which benefits everyone involved in spatial microsimulation modelling.

The Bland-Altman test (Bland & Altman, 1986, 1999) is a graphical method to compare two measurement techniques. In this method the differences x-y (or alternatively the ratios x/y) between the two techniques are plotted against the means of the two techniques, (x+y)/2. The test evaluates whether the differences are statistically different from 0, or the ratios statistically different from 1. In an application to spatial microsimulation (Timmins & Edwards, 2016b), x is the model result for an area; and y is the benchmarked result that the model is trying to match. The plot gives an idea of area level variation as well as total variation. The authors identify a number of benefits of using the BA test:

• it provides information on area error as well as total error;

• it identifies any bias and the direction of the bias;

• it reveals outliers;

• it can handle empty cell counts;

• it shows the distribution of the error;

• it is possible to compare plots across models with different scales;

• it is easy to calculate and interpret.

The BA plots show more information than the scatter plots normally used to show error in a model. Timmins & Edwards (2016) show that the scatter plots from a model can suggest excellent results, while the BA plots show up areas with some issues which the scatter plots had not identified.
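The quantities behind a BA plot are simple to compute. The sketch below derives the per-area differences and means, the bias (mean difference) and limits of agreement at 1.96 standard deviations, which is the usual Bland-Altman convention; the area values are invented and are not taken from Timmins & Edwards.

```python
import statistics


def bland_altman(model, benchmark):
    """Return plot coordinates, bias and limits of agreement."""
    diffs = [x - y for x, y in zip(model, benchmark)]        # vertical axis
    means = [(x + y) / 2 for x, y in zip(model, benchmark)]  # horizontal axis
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return means, diffs, bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical modelled and benchmark counts for five small areas
model = [102, 198, 310, 395, 520]
benchmark = [100, 200, 300, 400, 500]
means, diffs, bias, lower, upper = bland_altman(model, benchmark)

# Areas falling outside the limits of agreement would be flagged as outliers
outliers = [i for i, d in enumerate(diffs) if d < lower or d > upper]
```

A non-zero bias indicates systematic over- or under-estimation, and its sign gives the direction.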

There has also been significant work on estimating measures of confidence for spatial microsimulation models over the last few years. Confidence intervals describe the uncertainty that exists in the model. They are important for policy makers as they can provide some idea of the range of values that might be expected in a real world policy change. In a previous paper, a number of experts highlighted estimates of confidence intervals as one of the needs for spatial microsimulation (Tanton, Williamson, & Harding, 2014).

Recent work on estimating measures of variability includes a method which runs the model a number of times using the replicate weights from the base survey to derive an estimate of variability (Tanton, 2015), calculating Z scores using the results from the model and the benchmark data (Rahman, 2017b), and using the confidence intervals derived from a regression model used in the spatial microsimulation model (Whitworth et al., 2017).

The first method (Tanton, 2015) involves running the model on a number of sub-samples from the original survey. The Australian Bureau of Statistics (ABS) provides a set of replicate weights on their Confidentialised Unit Record Files (CURF) which is used in the spatial microsimulation model. These are calculated by the ABS based on a Jackknife replicate method using a delete one group Jackknife (Australian Bureau of Statistics, 2006). This means that on the CURF provided by the ABS, there are 60 different versions of the survey weights, each calculated using a different sub-sample of respondents. The respondents which were removed for the sub-sample have a weight of 0 for that particular replicate weight. The procedure runs the spatial microsimulation model on each of the 60 sub-samples, by removing those observations with a weight of 0. This provides some variability around the original run using the original survey weight, and this variability is then used to calculate confidence intervals.
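The replicate-weight loop can be sketched as below. The "model" here is only a weighted mean standing in for a full spatial microsimulation run, and the variance formula is the standard delete-a-group jackknife estimator, which is an assumption since the formula is not stated above; four toy replicates are used instead of 60.

```python
import math


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def jackknife_ci(values, full_weights, replicate_weights, model=weighted_mean, z=1.96):
    """Rerun the model on each replicate, dropping records with zero weight."""
    estimate = model(values, full_weights)
    replicates = []
    for rep in replicate_weights:
        kept = [(v, w) for v, w in zip(values, rep) if w > 0]
        replicates.append(model([v for v, _ in kept], [w for _, w in kept]))
    g = len(replicates)
    # Delete-a-group jackknife variance (assumed formula)
    variance = (g - 1) / g * sum((r - estimate) ** 2 for r in replicates)
    se = math.sqrt(variance)
    return estimate, estimate - z * se, estimate + z * se


# Toy survey: four records, each replicate drops one record
values = [10, 20, 30, 40]
full_weights = [1, 1, 1, 1]
replicate_weights = [
    [0, 4 / 3, 4 / 3, 4 / 3],
    [4 / 3, 0, 4 / 3, 4 / 3],
    [4 / 3, 4 / 3, 0, 4 / 3],
    [4 / 3, 4 / 3, 4 / 3, 0],
]
estimate, lower, upper = jackknife_ci(values, full_weights, replicate_weights)
```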

While this procedure is reasonably easy to implement by creating a loop to rerun the model across different sub-samples, the GREGWT model is estimated for 1,300 areas across Australia, and takes about nine hours to run. Using the replicate method, this would scale up to 61 times 9 hours (60 replicate weights and one original survey weight), or 549 hours, or nearly 23 days. The GREGWT method also provides a weight for all 1,300 areas, so there would be 79,300 columns (1,300 areas times (60 replicate weights + 1 sample weight)) in the final dataset at a minimum. So the method is impractical.

In testing the method for ten random areas in Australia (to reduce the time taken and the size of the final file), the author found that the confidence intervals were very small. This was because the method was measuring the model variability and Australian sample variability, not small area sample variability. The model variability is low because GREGWT is a deterministic model and gets very similar results given similar input data; and the sample variability is low because the sample was a large Australia wide sample designed by the ABS to be reliable. So in the end, the practical issues, and the final results (very low confidence intervals), suggested that the method was not feasible for the GREGWT model.

The second method (see Rahman, 2017) calculates Z scores based on the modelled estimates from a GREGWT model and the Census benchmark variables. The Z score is calculated as:

(1) $Z_{ij}=\frac{\hat{P}_{ij}-P_{ij}}{\sqrt{\frac{P_{ij}\left(1-P_{ij}\right)}{\sum_{j}n_{ij}}}}$

where $Z_{ij}$ is the Z score for the i-th small area and j-th benchmark category; $P_{ij}$ is the true estimate for the i-th area and j-th benchmark category from the benchmark table; $\hat{P}_{ij}$ is the estimate from the model for area i and benchmark category j; and $n_{ij}$ is the population of the i-th area and j-th benchmark category.
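As a worked illustration of Equation (1) for a single area, the sketch below treats the benchmark and modelled values as proportions of the area total. That treatment, the function name and the counts are assumptions made for the example.

```python
import math


def z_scores(model_counts, benchmark_counts):
    """Z score per benchmark category for one small area."""
    n_total = sum(benchmark_counts)      # sum over j of n_ij
    model_total = sum(model_counts)
    scores = []
    for m, b in zip(model_counts, benchmark_counts):
        p = b / n_total                  # P_ij from the benchmark table
        p_hat = m / model_total          # modelled estimate of P_ij
        # Undefined when p is 0 or 1, since the standard error is then zero
        se = math.sqrt(p * (1 - p) / n_total)
        scores.append((p_hat - p) / se)
    return scores


# Invented counts for three benchmark categories in one area
benchmark_counts = [600, 300, 100]
model_counts = [590, 310, 100]
scores = z_scores(model_counts, benchmark_counts)
```

Scores with an absolute value below 1.96 would not differ from the benchmark at the 5% level.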

An application to housing stress in Australia using different housing tenures showed that the Z statistics were close to 0 for all households and private renter households; but there was low precision for public renters.

The final method was recently published by UK researchers using an IPF model (Whitworth et al., 2017). This method uses the results from a multilevel model used for benchmark selection in an IPF setting. For spatial microsimulation models, benchmarks that are associated with the final output variable are important for accurate results. A regression model can be used to select the best benchmarks. Whitworth et al. (2017) suggest that this can be a multilevel model to incorporate area level differences, as well as person and household benchmarks.

To estimate the confidence interval around the point estimate from the IPF, 10,000 values are drawn randomly from the known distribution of the residual between-area error term, which is normally distributed with a mean of zero and a standard deviation as estimated by the multilevel model. The point estimate and the 10,000 separate between-area error terms are then expressed as log odds, and each error term is added to the point estimate to produce 10,000 plausible small area estimates. These estimates are then converted from predicted log odds into predicted probabilities and 95% confidence intervals are calculated.
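These steps can be sketched as a short simulation. The point estimate and standard deviation are invented, the standard deviation is assumed to be on the log-odds scale, and the interval is taken from the 2.5th and 97.5th percentiles of the simulated values, which is one reasonable reading of how the 95% interval is formed.

```python
import math
import random


def logit(p):
    return math.log(p / (1 - p))


def inv_logit(x):
    return 1 / (1 + math.exp(-x))


def simulated_interval(point_estimate, between_area_sd, draws=10_000, seed=0):
    rng = random.Random(seed)
    base = logit(point_estimate)                   # point estimate as log odds
    simulated = sorted(
        inv_logit(base + rng.gauss(0.0, between_area_sd))  # add error, convert back
        for _ in range(draws)
    )
    lower = simulated[int(0.025 * draws)]
    upper = simulated[int(0.975 * draws) - 1]
    return lower, upper


# Hypothetical small area rate of 20% and between-area SD of 0.3 (log odds)
lower, upper = simulated_interval(0.20, 0.3)
```

Because the error is added on the log-odds scale, the resulting interval is asymmetric around the point estimate and always lies between 0 and 1.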

The authors find that the confidence intervals calculated using this method appear reasonable, suggesting that the method could be applied to any spatial microsimulation that uses benchmark tables to provide point estimates.

### 2.3 Summary

This section has summarised developments in spatial microsimulation methods since 2014: incorporating errors into IPF (P-MEDM); using a multilevel approach to IPF; and using Fitness Based Synthesis, either as an alternative to IPF or applied to the output of a generalised regression reweighting method. Most of these developments have been shown to provide better estimates than the basic method (either IPF or a generalised regression reweighting method).

In terms of validation, Bland-Altman plots provide more information than scatter plots of modelled vs. actual results, and are a useful addition to the validation toolkit. A reliable, workable and fully tested method of calculating confidence intervals is not yet available. The method developed by Whitworth et al. (2017) is the most promising, as it develops confidence intervals around the variable being estimated, rather than a variable in the benchmark tables (as the Z test does); however, it is very new, and needs to be tested in other situations. It also relies on a multilevel model of the benchmarks, as described above, and presumably this must also be an accurate model. The Z test, suggested by Rahman (2017), calculates estimates around the benchmark values. This is of limited use, as what is required are estimates of uncertainty around the estimated value, for which there are no benchmarks and therefore no Z statistic. The method suggested by Tanton (2015) using replicate weights proved unworkable for small area estimation due to the amount of time it takes to run, and the size of the final file.

## 3. Where to for spatial microsimulation

This section provides my own views, from the reading of the literature and the spatial microsimulation landscape, of where the future work in spatial microsimulation lies. These views were developed in discussion with others in the field and are not prioritised in any way. I have identified four areas which are purely my reflection on what I think will be important for spatial microsimulation in the future. Broadly, the four areas I have identified are:

• further development of estimates of variability;

• linking models from different disciplines to tackle wicked problems;

• making results from spatial microsimulation models more palatable to Government policy makers and the public; and

• modelling using big data, for example, from smart transport cards or IoT devices.

### 3.1 Further development of estimates of variability

While the last section identified a number of developments in estimating variability in spatial microsimulation models, none has provided a reliable, tested method of the kind readily available to researchers using national microsimulation models, for example, tax/transfer models (Cohen, 1991; Creedy, Kalb, & Kew, 2007).

This area is essential for spatial microsimulation to gain acceptance from policy makers and other academics. Spatial microsimulation is becoming more common in policy analysis (Ballas & Clarke, 2001; Hynes, Morrissey, O’Donoghue, & Clarke, 2009; Tanton, Vidyattama, McNamara, Vu, & Harding, 2009) but for spatial microsimulation models to be accepted for modelling policy in Government, there needs to be some estimate of the potential range of values (confidence intervals), rather than point estimates. These confidence intervals provide the policy makers with some idea of the high/low impact of the policy change.

### 3.2 Linking models

Many problems seen in the world today require input from different disciplines, and have no one solution; there may be a range of solutions. These problems are sometimes called “wicked problems” (Head, 2008), and call for an integrated approach (Pearson, Norman, O’Brien, & Tanton, 2017). Examples of wicked problems are climate change, social injustice, entrenched disadvantage and healthcare. Wicked problems require social, economic and environmental approaches to be combined in order to provide information to policy makers and politicians on policies that may assist in resolving these problems, rather than offering isolated solutions to them. By providing a single numeric result, the current range of spatial microsimulation models encourages the user to say that this is what will happen when this policy is introduced; for example, poverty rates will go down in one area and up in another area. But these results typically look at changes in income with no behavioural change, no social change, and no environmental change. For example, what happens if, because incomes are higher in an area due to a policy change, cars are used more often and CO2 emissions increase? I would argue that the next range of spatial microsimulation models needs to start linking different models together, so that impacts across different areas can be estimated.

This work has started, but so far it has mainly linked economic models (Hérault, 2010; Rao, Tanton, & Vidyattama, 2015). Recent work by a number of groups in Australia has started looking at how a synthetic population can be used to integrate a number of different models (Tanton, Perez, & Pettit, 2017; Tanton & Vidyattama, 2018), and significant progress still needs to be made in this area.

### 3.3 Making results from spatial microsimulation models more palatable to government policy makers and the public

Tax/Transfer microsimulation models have been part of the Government policy landscape for a number of years and have been used extensively, for example in modelling budget policies (Berthier & Hudson, 2017; Stevenson, Ledda, Pineda, Smith, & Kluth, 2017) or for modelling in the Commonwealth’s Inter-Generational Reports in Australia (Commonwealth Treasury, 2007). However, while spatial microsimulation models have been used to model the spatial impacts of Government policy (Harding, Vu, Rodgers, Tanton, & Vidyattama, 2009; Tanton et al., 2009), they have not gained the same acceptance and use by public servants and politicians in Government as tax/transfer microsimulation models. An exception to this is Canada, where spatial microsimulation is used for demographic projections by the national statistical office (Statistics Canada, 2010b, 2011b).

In Australia, we are now finding increasing acceptance of the results from spatial microsimulation models being used to inform policy, although confidentiality restrictions mean that many of these models cannot be published immediately, instead having to wait until a policy is finalised and implemented before publications can be produced.

There may be many reasons for this reticence to use spatial microsimulation models to inform policy. One may be the issue of not being able to produce confidence intervals, already raised in this paper. For a recent model developed by NATSEM for an Australian Government department that is still subject to a Cabinet in Confidence clause, sensitivity analysis of the input parameters and variables was required, due to the fact that confidence intervals could not be produced. This sensitivity analysis applied a simple 10% increase and decrease to each parameter to identify the potential impact on the final results if the parameter changed.
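A one-at-a-time sensitivity analysis of this kind can be sketched in a few lines. The model below is a hypothetical stand-in (a simple housing cost ratio) and is not the NATSEM model referred to above; only the 10% increase and decrease follows the description.

```python
def sensitivity(model, params, change=0.10):
    """Vary each parameter up and down by `change`, holding the others fixed."""
    base = model(params)
    results = {}
    for name, value in params.items():
        low = model({**params, name: value * (1 - change)})
        high = model({**params, name: value * (1 + change)})
        # Store the change in the result relative to the base run
        results[name] = (low - base, high - base)
    return base, results


# Toy model: share of weekly income spent on rent (illustrative only)
def housing_cost_ratio(p):
    return p["rent"] / p["income"]


base, results = sensitivity(housing_cost_ratio, {"rent": 300.0, "income": 1000.0})
```

Comparing the size of the changes across parameters shows which inputs the final result is most sensitive to.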

Another factor that may put public servants and politicians off spatial microsimulation models is their complexity. In Australia, the tax/transfer models used to model the impact of the budget are static, rather than behavioural or dynamic, because static models are easier to explain and understand, and rely on fewer assumptions. Once labour force decisions (for example) are brought into a model, the complexity and the number of assumptions increase, making confidence intervals larger. These behavioural microsimulation models are not used as much by policy advisers and politicians, although the Productivity Commission with the Australian Treasury is developing a behavioural microsimulation model called CAPITA-B (Marshall, 2016).

Another point that may put policy makers off spatial microsimulation models is an attitude among some academic modellers that the results from their models are the only thing that should be considered in the policy process; that is, the idea of evidence based policy with the results from the microsimulation model being the only evidence. This is not limited to spatial microsimulation modellers, nor to microsimulation modellers, and the view is not held by all modellers. It is also a generalisation made by the author, who has crossed from being a policy analyst to being a modeller in the University system. Modellers need to recognise that the political process is not just about evidence from one source, but evidence from a number of sources, budget restrictions, voter preferences, and political negotiation. At the end of the day, the results from our models feed into all this, but as the previous section on wicked problems has made clear, it is more complex than just using the results from one model. Mike Batty, a prominent spatial scientist in the UK who has been involved in modelling for cities since the late 1960s, puts this succinctly (and proves that the previous comment is a generalisation) when he writes:

“The purpose of our science is to inform the dialogue, not to generate “answers” or “solutions” per se, notwithstanding the fact that we represent the argument in these terms.” (Batty, 2013, p. 360).

As modellers, we need to remember that we are providing input into a much larger, and more complex, political process. We need to accept that our models will inform the solutions to wicked problems, and we need to work closely with policy makers to inform their policies, without claiming that we are solving them.

### 3.4 Modelling using big data

Big data is an area that is currently ripe for spatial microsimulation models. We are now seeing a proliferation of big data from smart cards used in transport systems, smart electricity meters, and other IoT devices in many areas. Smart cities, where free parking spaces show up in an app, and CO2 levels are monitored in real time, are now a reality.

Given that much of this data is spatial, there are obvious opportunities to map the data, and visualise the patterns within a city (Charles-Edwards & Bell, 2013; Zhong, Manley, Muller Arisona, Batty, & Schmitt, 2015). Recent work in the UK has shown how electricity consumption from smart meters can support the collection of population statistics (Anderson et al., 2017; Newing, Anderson, Bahaj, & James, 2016).

Going beyond simple mapping of this data, one area of potential growth in the next few years is looking at how these huge amounts of data can feed into spatial microsimulation models to increase their accuracy, impute other information, or assist projections of policy. An example that may be considered is using IoT devices to collect information on CO2 emissions in a street, and then using weather projections of wind force and direction to project spatial levels of CO2 in particular areas at a particular time. The projections of wind force and direction could then be varied to provide a number of different scenarios, based on different seasons, to feed into policy decisions about road accessibility at certain times of the day.
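The CO2 example above can be sketched in code. What follows is a minimal, illustrative sketch only, and not a method taken from the literature cited here. It assumes a row of adjacent street segments ordered west to east, a fixed background CO2 level, that the excess above background is inversely proportional to wind speed (a box-model style simplification), and that half of each segment's excess is carried to its downwind neighbour. The names `Scenario`, `project_co2` and `BACKGROUND_PPM`, and all of the numbers, are hypothetical.

```python
from dataclasses import dataclass

# Assumed ambient CO2 level in ppm; illustrative, not a measured value.
BACKGROUND_PPM = 415.0


@dataclass(frozen=True)
class Scenario:
    """A hypothetical wind scenario, e.g. one per season."""
    name: str
    wind_speed: float  # metres per second, must be positive
    wind_from: str     # "W" or "E": the side the wind blows from


def project_co2(observed_ppm, observed_wind_speed, scenario):
    """Project CO2 (ppm) per street segment under a wind scenario.

    Segments are assumed to be ordered west to east.
    """
    if observed_wind_speed <= 0 or scenario.wind_speed <= 0:
        raise ValueError("wind speeds must be positive")
    if scenario.wind_from not in ("W", "E"):
        raise ValueError("wind_from must be 'W' or 'E'")

    # Assumption: excess above background is inversely proportional to
    # wind speed, so a stronger wind dilutes it.
    scale = observed_wind_speed / scenario.wind_speed
    excess = [max(ppm - BACKGROUND_PPM, 0.0) * scale for ppm in observed_ppm]

    # Toy advection: half of each segment's excess stays put, and half
    # moves to the downwind neighbour (or leaves the study area).
    step = 1 if scenario.wind_from == "W" else -1
    projected = [BACKGROUND_PPM] * len(excess)
    for i, value in enumerate(excess):
        projected[i] += 0.5 * value
        j = i + step
        if 0 <= j < len(excess):
            projected[j] += 0.5 * value
    return projected


# Hypothetical sensor readings (ppm), taken with a 2 m/s wind.
readings = [415.0, 455.0, 415.0]
scenarios = [
    Scenario("winter", wind_speed=4.0, wind_from="W"),
    Scenario("summer", wind_speed=1.0, wind_from="E"),
]
for s in scenarios:
    print(s.name, project_co2(readings, 2.0, s))
```

A real application would replace the toy dilution and advection rules with a calibrated dispersion model, and would link the projected levels for each area to the synthetic population from the spatial microsimulation model.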

## 4. Conclusions

This paper has provided a summary of the latest developments in spatial microsimulation, as well as providing some suggestions on where the next steps may be. One of the exciting aspects of spatial microsimulation is that it is used across many different disciplines, which means that development can come from various directions.

Method development continues, although most developments are refinements of current approaches. The process of estimating confidence intervals is still an area that needs some testing. While recent developments have been made, some of these are unmanageable in terms of the time taken and the size of the final file (Tanton, 2015), and confidence intervals using the Z statistic (Rahman, 2017b) can only be applied to an indicator where reliable small area data is currently available. The method suggested by Whitworth et al. (2017) is very new and requires further testing, but has some potential.

In terms of areas of development and potential new work, I have identified four areas: further work on bedding down a method for estimating confidence intervals; linking models from different disciplines to tackle wicked problems; making results from spatial microsimulation models more mainstream in terms of public policy; and using big data from smart cities in spatial microsimulation models, either to increase accuracy or to be incorporated into simulations. All of these areas have potential for development over the next few years, but this list is certainly not exhaustive. Given the inter-disciplinary nature of spatial microsimulation, we can expect to see many exciting developments over the next few years, as there have been over the last three years.

## References

1. Electricity consumption and household characteristics: Implications for census-taking in a smart metered future (2017). Computers, Environment and Urban Systems 63:58–67.
2. Survey and replicate weights: 1406.0.55.002 - User Manual: ABS Remote Access Data Laboratory (RADL), Mar 2006 (2006). http://www.abs.gov.au/ausstats/abs@.nsf/Lookup/3607C2551414E995CA257A5D000F7C5D.
3. The Local Implications of Major Job Transformations in the City: A Spatial Microsimulation Approach (2001). Geographical Analysis 33:291–311.
4. Exploring microsimulation methodologies for the estimation of household attributes (1999). Geography pp. 1–46.
5. The New Science of Cities (2013). The MIT Press.
6. Income Tax in Scotland: 2017 update (2017). https://sp-bpr-en-prod-cdnep.azureedge.net/published/2017/12/6/Income-Tax-in-Scotland--2017-update/SB 17-84.pdf.
7. Statistical methods for assessing agreement between two methods of clinical measurement (1986). Lancet 1:307–310.
8. Measuring agreement in method comparison studies (1999). Statistical Methods in Medical Research 8:135–160.
9. Estimating the Service Population of a Large Metropolitan University Campus (2013). Applied Spatial Analysis and Policy 6:209–228.
10. Microsimulation for Urban and Regional Policy Analysis (1996). Pion.
11. Estimating small area demand for water: A new methodology (1997). Water and Environment Journal 11:186–192.
12. The Dynamics of Urban Spatial Structure: The Progress of a Research Programme (1985). Transactions of the Institute of British Geographers 10:427–451.
13. Models, Uncertainty and Confidence Intervals (1991). In: Improving Information for Social Policy Decisions – The Uses of Microsimulation Modeling: Review and Recommendations. pp. 89–95.
14. Intergenerational report 2007 (2007). http://archive.treasury.gov.au/igr/igr2007/report/PDF/IGR_2007_final_report.pdf.
15. Confidence intervals for policy reforms in behavioural tax microsimulation modelling (2007). Bulletin of Economic Research 59:37–65.
16. A Spatial Microsimulation Model for the Estimation of Heat Demand in Hamburg (2016). In: REAL CORP 2016 – SMART ME UP! How to become and how to stay a Smart City, and does this improve quality of life? Proceedings of 21st International Conference on Urban Planning, Regional Development and Information Society. pp. 39–46.
17. Spatial microsimulation estimates of household income distributions in London boroughs, 2001 and 2011 (CASE Papers No. 196) (2016). Centre for Analysis of Social Exclusion.
18. Migration and Area (1957). In: Migration in Sweden: a Symposium. Lund Studies in Geography, Series B, 13. pp. 27–158.
19. Innovation diffusion as a spatial process (1967). University of Chicago Press.
20. Improving Work Incentives and Incomes for Parents: The National and Geographic Impact of Liberalising the Family Tax Benefit Income Test (2009). Economic Record 85:S48–S58.
21. Wicked Problems in Public Policy (2008). Public Policy 3:101–118.
22. Sequential linking of Computable General Equilibrium and microsimulation models: a comparison of behavioural and reweighting techniques (2010). International Journal of Microsimulation 3:35–42.
23. Small area estimates of smoking prevalence in London. Testing the effect of input data (2012). Health and Place 18:630–638.
24. A spatial micro-simulation analysis of methane emissions from Irish agriculture (2009). Ecological Complexity 6:135–146.
25. Predicting The Need For Aged Care Services At The Small Area Level: The CAREMOD Spatial Microsimulation Model (2009). International Journal of Microsimulation 2:27–42.
26. Synthetic Population Generation with Multilevel Controls: A Fitness-Based Synthesis Approach and Validations (2015a). Computer-Aided Civil and Infrastructure Engineering 30:135–150.
27. Synthetic Population Generation with Multilevel Controls: A Fitness-Based Synthesis Approach and Validations (2015b). Computer-Aided Civil and Infrastructure Engineering 30:135–150.
28. CAPITA-B: A Behavioural Microsimulation Model (2016). Canberra.
29. An unconstrained statistical matching algorithm for combining individual and household level geo-specific census and survey data (2017). Computers, Environment and Urban Systems 63:3–14.
30. The Role of Digital Trace Data in Supporting the Collection of Population Statistics - the Case for Smart Metered Electricity Consumption Data (2016). Population, Space and Place 22:849–863.
31. Spatial Microsimulation Modelling: a Review of Applications and Methodological Choices (2014). International Journal of Microsimulation 7:26–75.
32. A New Type of Socio-Economic System (1957). The Review of Economics and Statistics 39:116.
33. An institutional perspective on programme integration (2017). Policy Studies 38:418–431.
34. A Fine Grained Hybrid Spatial Microsimulation Technique for Generating Detailed Synthetic Individuals from Multiple Data Sources: An Application to Walking and Cycling (2017a). International Journal of Microsimulation 10:167–200.
35. A Fine Grained Hybrid Spatial Microsimulation Technique for Generating Detailed Synthetic Individuals from Multiple Data Sources: An Application to Walking and Cycling (2017b). International Journal of Microsimulation 10:167–200.
36. Small area housing stress estimation in Australia: Calculating confidence intervals for a spatial microsimulation model (2017a). Communications in Statistics – Simulation and Computation 46:7466–7484.
37. Small area housing stress estimation in Australia: calculating confidence intervals for a spatial microsimulation model (2017b). Communications in Statistics – Simulation and Computation 46:7466–7484.
38. Modelling the economic, social and ecological links in the Murray-Darling basin: a conceptual framework (2015). The Australasian Journal of Regional Studies 21:80–102.
39. Validation of spatiodemographic estimates produced through data fusion of small area census records and household microdata (2017). Computers, Environment and Urban Systems 63:38–49.
40.
41. Projections of the Diversity of the Canadian population 2006 to 2031. Statistics Canada Catalogue No. 91-551-X (2010).
42. Population Projections by Aboriginal Identity in Canada, 2006 to 2031. Statistics Canada Catalogue No. 91-552-X (2011).
43. CAPITA – Treasury’s microsimulation model of personal income tax and transfers (Treasury Working Paper) (2017). Canberra.
44. A Review of Spatial Microsimulation Methods (2014). International Journal of Microsimulation 7:4–25.
45. Estimating confidence intervals for spatial microsimulation using replicate weights (2015). Paper presented at the International Microsimulation Association Conference.
46. Spatial Models (2014). In: C O’Donoghue, editor. Handbook of Microsimulation Modelling. Emerald. pp. 367–383.
47. A framework for integrating collaborative city design with individual centred modelling (2017). In: Agent Based Modelling of Urban Systems. Springer.
48. Using Spatial Microsimulation to Derive a Base File for a Spatial Decision Support System (2018). In: Population Change and Impacts in Asia and the Pacific.
49. Old, Single and Poor: Using Microsimulation and Microdata to Analyse Poverty and the Impact of Policy Change among Older Australians (2009). Economic Papers: A Journal of Applied Economics and Policy 28(2):102–120. http://doi.org/10.1111/j.1759-3441.2009.00022.x.
50. Disadvantage in the Australian Capital Territory (2015). Policy Studies 36:92–113.
51. Small Area Estimation Using A Reweighting Algorithm (2011). Journal of the Royal Statistical Society: Series A (Statistics in Society) 174:931–951.
52. Comparing two methods of reweighting a survey file to small area data (2014). International Journal of Microsimulation 7:76–99.
53. Validation of Spatial Microsimulation Models: a Proposal to Adopt the Bland-Altman Method (2016). International Journal of Microsimulation 9:106–122.
54. Estimating uncertainty in spatial microsimulation approaches to small area estimation: A new approach to solving an old problem (2017). Computers, Environment and Urban Systems 63:50–57.
55. The Estimation of Population Microdata by Using Data from Small Area Statistics and Samples of Anonymised Records (1998). Environment and Planning A: Economy and Space 30:785–816.
56. A New Representation of the Urban System for Modelling and for the Study of Micro-Level Interdependence (1976). Area 8:246–254.
57. Measuring variability of mobility patterns from multiday smart-card data (2015). Journal of Computational Science 9:125–130.

## Article and author information

### Author details

1. #### Robert Tanton

National Centre for Social and Economic Modelling, Institute for Governance and Policy Analysis, University of Canberra, Australia
##### For correspondence
robert.tanton@canberra.edu.au

### Publication history

1. Version of Record published: April 30, 2018 (version 1)