
Can a Mayor Change the Course of a Pandemic? An Agent-Based Study on the COVID Spread on Local Level in Germany
Abstract
During the COVID-19 pandemic, a large variance in incidence rates at the local level, e.g. across cities and districts within one country, was observed, even though the same non-pharmaceutical measures were taken to control the spread of the virus. This variance raised the question of whether the spread of incidence rates can be explained by statistical processes and local population statistics alone, or whether other factors, e.g. local information campaigns, have to be considered. In this paper we study the expected spread of incidence rates in the German state of Rhineland Palatinate during the second COVID-19 wave using an agent-based simulation. We find that our agent-based model is able to replicate the observed incidence rates per district by considering only state-wide policies. While this does not imply that local policies or specific local political circumstances have no impact on local infection rates, it suggests that the dominant factors driving the distribution of infection rates across a state are statistical effects, demographics, and state-wide policies.
1. Introduction
Amidst the global COVID-19 pandemic, nations took various measures to curb the virus’s spread. These encompassed extensive lockdowns, movement restrictions, social distancing guidelines, quarantine procedures, mask mandates, and robust testing and contact tracing initiatives. Travel restrictions were enforced, while remote work and online education were promoted. Most measures were taken at the state or national level; nevertheless, a significant variance in the actual infection rates at the local level, e.g. in districts or cities, was observed. This led to the assumption, mainly in the media, that special local circumstances, e.g. the impact of local politics or of testing and information campaigns, might cause this variance (STERN, 2021; Zeitung, 2020).
In this study we test whether the observed local variations in incidence rates can be modelled with an agent-based model (ABM) that considers only statistical and demographic facts as well as state policies during the pandemic, but no dedicated local voluntary campaigns. We chose the state of Rhineland Palatinate as a testing environment for this hypothesis. Given that Rhineland Palatinate is a medium-sized German state with 4.1 million inhabitants, including larger cities and more rural areas, we argue that the general conclusion should remain largely valid for the whole of Germany. It should be noted that legal restrictions in Germany during the COVID-19 pandemic could only be issued at the national or state level, e.g. for the whole of Rhineland Palatinate. Local or regional measures within Rhineland Palatinate were only recommendations without any legal implications.
Mathematical simulations and forecasts of epidemic or pandemic evolution commonly rely on compartmental models or agent-based models. Compartmental models categorise the population into compartments (e.g., susceptible, exposed, infected, recovered) and employ differential equations to depict transitions between these states (e.g., the SIR model (Wang et al., 2016; Roda et al., 2020)). In contrast, agent-based models simulate individual agents with specific characteristics and behaviours, providing a more detailed representation of interactions and spatial dynamics (Bullock, 2021; Pillai et al., 2023; Vyklyuk et al., 2021). We chose the latter modelling approach for our study, given that agent-based models can describe detailed population statistics and the interaction between agents as well as their commuting behaviour.
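To make the compartmental picture concrete, the following minimal sketch integrates the classic SIR equations with a forward-Euler step. The parameter values (`beta`, `gamma`, population size, step size) are illustrative assumptions and are not taken from this study.

```python
def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt  # flow from S to I in one step
        new_recoveries = gamma * i * dt         # flow from I to R in one step
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((k * dt, s, i, r))
    return history


# Hypothetical values chosen for illustration only.
trajectory = simulate_sir(s0=9990, i0=10, r0=0, beta=0.3, gamma=0.1, days=160)
peak_time, _, peak_infected, _ = max(trajectory, key=lambda row: row[2])
print(f"peak of {peak_infected:.0f} infected on day {peak_time:.0f}")
```

Such a model tracks only compartment totals; it carries no information on households, workplaces or commuting, which is what the agent-based approach adds.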
The agent-based simulation used for our study is briefly described in Section 2, followed by a summary of the predictive power of the framework within Rhineland Palatinate (RLP) during the second COVID-19 wave between October 2020 and February 2021 (Section 3). The observed statistical spread of infection incidences is discussed and interpreted in Section 4, followed by a brief conclusion.
2. The June-Germany Framework
The June framework, introduced by Bullock et al. in 2021 (Bullock, 2021), is an agent-based model designed to simulate epidemics in a population, specifically focusing on the initial and subsequent waves of the COVID-19 pandemic. Notably, this model incorporates detailed geographic and sociological data for England. June has demonstrated its ability to accurately forecast the geographical and sociological dynamics of COVID-19 transmission, as discussed in detail in the original publication (Bullock, 2021).
Building upon the success of June, June-Germany (Akdogan et al., 2023) adapts the framework for application in Germany. The underlying modelling approach for the different aspects of the epidemic simulation, such as contact patterns, infection processes, sociodemographic population representation or governmental policies is not changed. However, June-Germany allows for the modelling of different virus strains as well as vaccination campaigns. Like its predecessor, it is implemented in Python and structured primarily into four interconnected layers:
Population Layer: This layer details individual agents, their static social environments (e.g., households, workplaces), and demographics across hierarchical geographic layers. Agents follow daily routines in discrete time-steps, associated with specific households, schools, and workplaces.
Interaction Layer: Captures daily routines such as commuting and leisure activities. Social contact networks define interactions, and disease transmission during public transportation is considered. Age-dependent social interaction matrices model contact frequency and intensity in various settings.
Disease Layer: Models disease transmission and effects, utilising probabilistic infection modelling that considers factors like transmissive probability, susceptibility, and exposure time. Health impacts range from asymptomatic cases to ICU admission and potentially lethal outcomes.
Policy Layer: Incorporates government policies for pandemic mitigation at localised levels, considering geographical regions and social interactions. This allows for the modelling of essential workers’ activities and general population compliance, with agent compliance influenced by social and demographic parameters.
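As a rough illustration of the probabilistic infection modelling in the disease layer, the sketch below combines a contact intensity, the infectiousness of the infectors present, the susceptibility of the infectee and the exposure time into an infection probability. The exponential form `1 - exp(-rate * time)` is a common choice and is assumed here; the exact expression used in June is given in Bullock (2021). All function and parameter names are hypothetical, and the units of the inputs are illustrative.

```python
import math
import random


def infection_probability(contact_intensity, infectiousness, susceptibility, exposure_time):
    """Probability that one susceptible agent is infected during a shared time slot.

    Assumed form: p = 1 - exp(-intensity * susceptibility * sum(infectiousness) * time).
    """
    hazard = contact_intensity * susceptibility * sum(infectiousness) * exposure_time
    return 1.0 - math.exp(-hazard)


def is_infected(probability, rng):
    """Draw a yes/no outcome for a single agent with the given probability."""
    return rng.random() < probability


rng = random.Random(42)
# Illustrative values: two infectors meeting one susceptible agent.
p = infection_probability(0.32, infectiousness=[0.1, 0.05], susceptibility=1.0, exposure_time=6.0)
print(f"infection probability: {p:.3f}, infected: {is_infected(p, rng)}")
```

With no infector present the probability is exactly zero, and it rises towards one as intensity, infectiousness or exposure time grow.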
While the interaction layer is unchanged, we expanded the disease layer of the June-Germany framework to allow for the modelling of different virus strains as well as vaccination campaigns. Moreover, the underlying data for the population and policy layers have been adapted. In particular, the policy layer includes all policies that were introduced at various stages of the pandemic by either the state or the federal government. This includes, for example, the closure of schools and universities, as well as of selected companies or industrial branches. Where necessary, the active workforce has been reduced and changes in leisure activities are considered to reflect the contact restrictions.
The geographical model is based on German administrative areas, featuring three layers: states, districts, and municipalities, with detailed demographic data from the 2011 Census (Zensus, 2011), which provides information in the age groups 0-4, 5-14, 15-34, 35-59, and ≥ 60. Population densities vary based on age and sex distribution, considering age as a significant risk factor for severe COVID-19 cases.
Household compositions are also derived from the 2011 Census data, categorised by the number of adults and children in each household. The simulation includes 14,502 primary schools (average 204 students per school) and 13,068 secondary schools (average 506 students per school). A teacher-to-student ratio of 0.12 and class sizes between 20 and 30 students are assumed. Agents are distributed to simulated schools based on their home addresses.
Jobs are classified by sector using the International Standard Industrial Classification, with companies modelled in each district based on the average number of employees for a sector. Workplace assignments during population generation consider mobility data.
Social activities and interactions are modelled similarly in both June and June-Germany. Agents’ weekday routines involve work/school, shopping, leisure, and staying at home. Beyond working hours, social activities include visits to cinemas, theatres, pubs, and restaurants, contingent on current state regulations and individual compliance. Commuting is represented by a directed network graph, accounting for both short-distance and long-distance travels.
The modelling of COVID-19 as well as the typical interactions of agents are assumed to be unchanged compared to the UK in the original June model.
3. Simulation of the COVID-19 Pandemic in the State of Rhineland Palatinate
We utilised June-Germany to model the second wave of the COVID-19 pandemic in the German state of Rhineland Palatinate, spanning from October 2020 to February 2021. The cumulative deaths in the age groups 0-4, 5-14, 15-34, 35-59, and ≥ 60 from October 1, 2020, to December 14, 2020, served as the basis for determining optimal model parameters. Of particular significance were the cumulative deaths in the age groups above 35, as minimal deaths were reported for the younger population. The chosen age groups reflect the age groups reported in the official government data used to fit the model.
Given the intricacy of the JUNE simulator, characterised by a high dimensionality in both input parameters and output space, we employed emulation and history matching (Andrianakis et al., 2017; Bower et al., 2010) to identify suitable matches to the data. The underlying problem is the large dimensionality of the input space combined with the significant computing time required to predict one output distribution for a given set of input parameters. History matching is a (pre)calibration technique that has proven successful with complex deterministic models and can identify the subset of the input space that could give rise to acceptable matches between model output and measured data. It relies on a Bayes linear approach: a simplified statistical approximation of the underlying model (an emulator) is used to identify potentially interesting regions of parameter space, and only these regions are explored further with the full simulation, which yields a significant reduction in computing time. To facilitate this process, we utilised the in-development R package hmer (Iskauskas et al., 2022), designed to streamline emulator construction and the generation of representative parameter sets for subsequent iterations of emulation and history matching. This package has been employed successfully for parameter estimation in other epidemiological scenarios. The emulation and history matching framework has several advantages over traditional parameter estimation methods, a notable one being that relatively few evaluations of the computationally expensive simulator are required to train an emulator. Following the optimisation of model parameters using data up to December 14, 2020, we projected the entire second wave until February 22, 2021.
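The core of history matching is an implausibility measure that compares the emulator prediction with the observation in units of the combined uncertainty; inputs whose implausibility exceeds a cutoff are discarded. The sketch below shows this test for a single output. The cutoff of 3, the variance terms and all numbers are illustrative assumptions, and the code does not reproduce the hmer interface.

```python
import math


def implausibility(observed, emulator_mean, emulator_var, observation_var, discrepancy_var=0.0):
    """Standardised distance between an observation and the emulator prediction."""
    total_var = emulator_var + observation_var + discrepancy_var
    return abs(observed - emulator_mean) / math.sqrt(total_var)


def non_implausible(candidates, observed, observation_var, cutoff=3.0):
    """Keep parameter sets whose emulated output is compatible with the data."""
    return [
        c for c in candidates
        if implausibility(observed, c["mean"], c["var"], observation_var) <= cutoff
    ]


# Hypothetical emulator predictions of cumulative deaths for three parameter sets.
candidates = [
    {"name": "set_a", "mean": 950.0, "var": 400.0},
    {"name": "set_b", "mean": 1400.0, "var": 900.0},
    {"name": "set_c", "mean": 1020.0, "var": 2500.0},
]
kept = non_implausible(candidates, observed=1000.0, observation_var=1000.0)
print([c["name"] for c in kept])
```

Only the surviving parameter sets would be passed on to the expensive full simulation in the next iteration, which is where the saving in computing time comes from.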
The comparison of the hospitalisation rate for the last 7 days is depicted in Figure 1, revealing good agreement, albeit with the simulation predicting a faster decline of the wave than observed in the data (see footnote 1). The total number of predicted hospitalised patients during the entire second wave is 5181, compared to the official number of 5638. A less precise prediction is evident for the incidence rate. Although the number of infections is accurately described for the age group ≥ 60, discrepancies by a factor of about three emerge for the age groups 15-34 and 35-59, indicating a substantial number of unreported cases. This ratio increases significantly for the 5-14 age group. While a general trend toward unreported cases is anticipated, the extent of the observed underreporting is surprising. A comprehensive discussion can be found in Akdogan et al. (2023).

Figure 1. Hospitalisation rate of the last 7 days (red line) for the age groups 35-59 (left) and ≥ 60 (right), together with the June-Germany simulation with the best-fitting parameters (blue line), its statistical uncertainty (shaded blue) and the simulated curves of alternative sets of model parameters tested during the fitting procedure (grey). The vertical line indicates the date up to which the data were fitted.
To test the effect of state-wide policies on the simulated events, we performed several additional studies. As examples, we discuss the overall hospitalisation rate for the last 7 days across all age groups when no policies are applied, as well as when no school-closure policies are enforced. As shown in Figure 2, the total number of hospitalisations over the simulation period increases drastically compared to the observed value when no policies are applied. In fact, nearly all inhabitants get infected in this scenario. When repeating the fits to data without assuming any school-closure measures, we find an increase by a factor of 2.5 ± 0.05 in the incidence rate. Among the 30 studied scenarios, the smallest observed increase was a factor of 2.4 and the largest a factor of 2.7. The increases in the hospitalisation rate and death rate are consistent, yielding factors of 2.6 ± 0.2 and 2.6 ± 0.1, respectively. Further details are discussed in Heger et al. (2023). The effects of the policies in the ABM simulation are therefore evident. To further improve and validate our model in the future, we foresee using public mobility data as well as considering recent work on ensemble-based data assimilation (Cocucci et al., 2022; Aleta et al., 2022; Chang et al., 2021).
4. Modelling of Regional Spreads
Given the observed mismatch between the reported and the simulated incidence rates, we first scale the simulated incidence rate to the observed rate with the same global factor across all population groups and districts. The relative spread in incidence rates is therefore not affected, but the comparison to the reported data is simplified.
Rhineland Palatinate is divided into 36 districts (Landkreise) and cities. The smallest is Zweibrücken with 34k inhabitants, the largest is the state capital Mainz with 220k inhabitants. It is important to note that the optimisation of model parameters is based solely on the overall cumulative death rate over all districts and cities, and no separate local information has been used. The resulting predictions of the variance among the 36 districts are therefore unbiased.
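The scaling step can be sketched as follows: a single factor, here taken as the ratio of total reported to total simulated cases, is applied to every district, so ratios between districts remain unchanged. How the global factor was derived in detail is an assumption, and the district names and case numbers are made-up placeholders.

```python
def scale_to_observed(simulated, observed):
    """Rescale simulated counts with one global factor so that the totals match."""
    factor = sum(observed.values()) / sum(simulated.values())
    scaled = {district: count * factor for district, count in simulated.items()}
    return scaled, factor


# Placeholder numbers for illustration only.
simulated = {"district_a": 9000.0, "district_b": 3000.0}
observed = {"district_a": 2800.0, "district_b": 1200.0}
scaled, factor = scale_to_observed(simulated, observed)
print(f"global factor: {factor:.3f}")
print(f"ratio a/b before: {simulated['district_a'] / simulated['district_b']:.2f}, "
      f"after: {scaled['district_a'] / scaled['district_b']:.2f}")
```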
The overall number of reported infections between October 2020 and February 2021 in each of the 36 districts and cities is shown in Figure 3. The uncertainty on the reported numbers is assumed to follow Poisson statistics. Also shown are the numbers simulated with June-Germany. The indicated uncertainties correspond to 1σ (i.e. 68%) confidence intervals and have a statistical and a systematic component. The statistical uncertainties are estimated by rerunning the simulation while varying the number as well as the location of the agents that are treated as infected at the beginning. The resulting distribution of infected agents across simulations at each time-step is Gaussian, and the 1σ interval is taken as the statistical uncertainty. The systematic uncertainties are due to uncertainties on the modelling parameters after the fitting procedure to data. The uncertainties of parameters that are not fitted are taken from the original June publication and are based either on other studies or on expert evaluations. A summary of the most important model parameters, their central values and their 95% confidence intervals is given in Table 1. The simulation was rerun several hundred times with the model parameters varied randomly within their associated uncertainties, assuming that each model parameter is Gaussian distributed around its initial value and that the 95% confidence interval corresponds to a variation of 2σ, i.e. two standard deviations. As expected, districts and cities with larger populations have more cases than those with smaller populations. This also explains the good agreement between the simulated and the reported values.

Figure 3. Overall number of infections for all 36 districts of Rhineland Palatinate between October 2020 and February 2021, once for the officially reported cases and once for the June-Germany simulation.
Table 1. Overview of the most relevant contact intensity parameters of the June framework. Together with the infectiousness of the infectors at a given time t, the susceptibility of the potential infectee and the exposure time interval during which two or more agents meet in the simulation, they form the basis for the probability that an agent gets infected. Technical details on the implementation are discussed in Bullock (2021). The central values used in the final simulation and their 95% CL intervals are also given.
Contact intensity parameter | Mean-Value | 95% CL interval | Contact intensity parameter | Mean-Value | 95% CL interval |
---|---|---|---|---|---|
care home | 0.28 | [0.16,0.40] | hospital | 0.19 | [0.11,0.27] |
care visits | 6.13 | [3.60,8.66] | household | 0.30 | [0.18,0.42] |
cinema | 0.52 | [0.30,0.73] | household visits | 0.55 | [0.32,0.78] |
city transport | 0.17 | [0.10,0.24] | inter city transport | 0.17 | [0.10,0.24] |
company | 0.32 | [0.19,0.45] | pub | 0.42 | [0.24,0.60] |
grocery | 0.48 | [0.28,0.67] | school | 0.32 | [0.19,0.45] |
gym | 0.40 | [0.23,0.57] | university | 0.17 | [0.10,0.24] |
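
The systematic variation of the model parameters can be sketched by drawing each parameter from a Gaussian whose standard deviation is a quarter of the width of its 95% CL interval, since that interval is taken to span ±2σ. Only three rows of Table 1 are included, the number of draws and the random seed are arbitrary, and the rerun of the full simulation for each draw is omitted.

```python
import random

# (central value, lower bound, upper bound of the 95% CL interval), from Table 1.
PARAMETERS = {
    "school": (0.32, 0.19, 0.45),
    "company": (0.32, 0.19, 0.45),
    "pub": (0.42, 0.24, 0.60),
}


def sample_parameters(parameters, rng):
    """Draw one parameter set, treating the 95% CL interval as +/- 2 sigma."""
    draw = {}
    for name, (central, low, high) in parameters.items():
        sigma = (high - low) / 4.0
        draw[name] = rng.gauss(central, sigma)
    return draw


rng = random.Random(2021)
draws = [sample_parameters(PARAMETERS, rng) for _ in range(300)]
mean_school = sum(d["school"] for d in draws) / len(draws)
print(f"mean sampled school intensity: {mean_school:.3f}")
```

Each draw would then serve as the input of one simulation run, and the spread of the resulting outputs gives the systematic uncertainty.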
Hence it is more instructive to compare the number of infections normalised to 10k inhabitants in each district or city, as shown in Figure 4. Here, too, very good agreement can be seen. A Kolmogorov-Smirnov test on the simulated and observed distributions yields a value of 0.8, i.e. the null hypothesis that the simulated and observed values share the same underlying distribution cannot be rejected. Moreover, a more rigorous analysis using the chi-squared between the simulated and observed distributions, taking into account the correlations among the systematic uncertainties, yields a value of 44 for 35 degrees of freedom. This corresponds to a p-value of 0.15, indicating good agreement. Hence there is no evidence of a significant disagreement between the observed and simulated distributions.
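For reference, a two-sample Kolmogorov-Smirnov comparison can be sketched in plain Python as below. The statistic is the largest distance between the two empirical cumulative distributions, and the p-value uses the standard asymptotic series approximation. The input numbers are placeholders, not the district data, and the source does not state which implementation was used.

```python
import math


def ks_two_sample(sample_a, sample_b):
    """Return the two-sample KS statistic D and an asymptotic p-value."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both empirical CDFs past all values equal to x.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    effective_n = math.sqrt(n * m / (n + m))
    lam = (effective_n + 0.12 + 0.11 / effective_n) * d
    if lam < 0.2:
        # The series converges slowly here, and the p-value is 1 to high precision.
        return d, 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, 101))
    return d, min(1.0, max(0.0, 2.0 * series))


# Placeholder infections per 10k inhabitants, for illustration only.
observed = [231.0, 254.0, 262.0, 270.0, 301.0, 318.0]
simulated = [225.0, 249.0, 259.0, 281.0, 296.0, 330.0]
statistic, p_value = ks_two_sample(observed, simulated)
print(f"D = {statistic:.3f}, p = {p_value:.3f}")
```

A large p-value means the test finds no evidence that the two samples come from different distributions; it does not prove that they are identical.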

Figure 4. Normalised number of infections per 10k inhabitants for all 36 districts of Rhineland Palatinate between October 2020 and February 2021, once for the officially reported cases and once for the June-Germany simulation.
The distributions of the overall number of infections and of the normalised number of infections, for the observed cases and the nominal simulation, are shown in Figure 5. Again, both distributions agree within their uncertainties. Kolmogorov-Smirnov tests on the simulated and observed distributions yield values of 0.9 and 0.8, respectively, i.e. the null hypothesis that the simulated and observed values share the same underlying distribution cannot be rejected. The better agreement for the overall number of infections (Figure 5a) can be explained simply by the strong correlation with the number of inhabitants in each district. The agreement between the two distributions does not change significantly when all systematic uncertainties on the model parameters are ignored. This indicates that the statistical uncertainties on the predicted values are dominant.

Figure 5. Spread of the overall number of infections (a) and spread of the number of infections per 10k inhabitants (b) in all 36 districts of Rhineland Palatinate between October 2020 and February 2021, once for the officially reported cases and once for the June-Germany simulation.
While the observed mean of infections per 10k inhabitants is 261 ± 9, the simulated mean is 256 ± 9. Even more interestingly, the spreads of the distributions also agree. Given that the underlying distributions might not be Gaussian, we take the root-mean-square (rms) value as a measure of the spread. The observed spread is rms = 54 ± 6, which agrees well with the simulated rms of 57 ± 7, where the reported uncertainty is purely statistical.
In order to understand whether the spread of the distributions is purely statistical or whether socio-demographics have to be considered, it is instructive to study the correlation between the observed and simulated numbers. Figure 6 shows the correlation of the overall number of infections per district as well as of the number of infections per 10k inhabitants between October 2020 and February 2021. The districts differ in the number of inhabitants, in their socio-demographics and in potential other factors such as local policies, the information available at the local level and different local behaviours. The size of the population in the different districts is accurately modelled, as are the most important socio-demographic factors, such as population densities, household sizes and age distributions, as discussed in Section 2.
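As a plausibility check, the quoted uncertainties are close to what the usual large-sample standard errors give for 36 districts: rms/√n for the mean and rms/√(2n) for the spread. The sketch below evaluates both with the numbers from the text. Whether the authors used exactly these formulas is an assumption, and the rms is interpreted here as the root-mean-square deviation from the mean.

```python
import math


def mean_and_rms(values):
    """Mean and root-mean-square deviation from the mean of a list of values."""
    mean = sum(values) / len(values)
    rms = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return mean, rms


def standard_errors(rms, n):
    """Large-sample standard errors of the mean and of the rms (Gaussian approximation)."""
    return rms / math.sqrt(n), rms / math.sqrt(2 * n)


N_DISTRICTS = 36
for label, rms in (("observed", 54.0), ("simulated", 57.0)):
    se_mean, se_rms = standard_errors(rms, N_DISTRICTS)
    print(f"{label}: error on mean ~ {se_mean:.1f}, error on rms ~ {se_rms:.1f}")
```

This gives about 9.0 and 6.4 for the observed values and about 9.5 and 6.7 for the simulated ones, in line with the uncertainties quoted above.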

Figure 6. Correlation between the observed and simulated overall number of infections per district (left) and the number of infections per 10k inhabitants (right) between October 2020 and February 2021.
A Pearson correlation coefficient of 0.8 ± 0.1 is found for the overall number of infections (Figure 6a); it is the result of the strong correlation with the number of inhabitants per district, which is accurately modelled. A Pearson correlation coefficient of 0.5 ± 0.1 is found between the observed and simulated numbers of infections normalised to 10k inhabitants (Figure 6b), i.e. after the effect of the different population sizes has been removed. This significant correlation implies that purely statistical effects do not describe the observed variance among the districts, since a correlation coefficient consistent with 0 would otherwise be expected.
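The Pearson coefficient can be computed as sketched below. The approximate standard error (1 − r²)/√(n − 1) is a textbook large-sample estimate added for illustration; the source does not state how its ± 0.1 was obtained, and the input values are placeholders.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def pearson_standard_error(r, n):
    """Large-sample approximation of the uncertainty on r."""
    return (1.0 - r * r) / math.sqrt(n - 1)


# Placeholder infections per 10k inhabitants, not the district data.
observed = [210.0, 245.0, 260.0, 290.0, 330.0]
simulated = [230.0, 240.0, 275.0, 265.0, 310.0]
r = pearson(observed, simulated)
print(f"r = {r:.2f} +/- {pearson_standard_error(r, len(observed)):.2f}")
print(f"error for r = 0.5 and 36 districts: {pearson_standard_error(0.5, 36):.2f}")
```

For r = 0.5 and 36 districts this estimate is about 0.13, so a coefficient of 0.5 lies several standard errors away from zero.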
The observed correlation does not allow for a definitive statement about whether the infection rates across different districts in RLP are accurately described by the simulation. If no correlation were observed, it could only be concluded that the infection rates are driven purely by random factors. The relevant question for our hypothesis is whether there are any districts that have a significantly larger or smaller infection rate per 10k inhabitants than expected from the simulation. If that were the case, it could be concluded that factors play a role which are not modelled.
However, Figure 5 demonstrates an accurate representation of the spread of incidence rates, suggesting that the environmental conditions and statistical effects considered in the simulation are sufficient to reproduce the observed data. The agent-based model June-Germany is therefore capable of describing the spread of infection numbers at the local level using only basic population statistics and common state-wide regulations, without incorporating any specific local measures such as information campaigns by the local authorities.
Does this imply that local authorities have no impact at all on the overall progression of the pandemic? To answer this question, it is important to note that the uncertainties in the reported infection numbers, as well as the limited number of districts studied in our simulation, do not allow for a definitive conclusion regarding the impact of local policies. To test a potential local effect, one could examine, for example, the outcome of a simulation in which only two-thirds of agents aged 5–14 in the district of Mainz adhere to the state-level school closure policy, while the remaining agents in this age group continue to meet in larger groups. This scenario results in a higher infection rate for the district of Mainz compared to the standard simulation setup. However, it would still be compatible with the observed spread across all districts at the 95% confidence level. Thus, local policies might still have an influence, but the expected spread of infection rates due to statistical effects and population demographics is so large that many local measures are likely to be obscured by state-level trends and purely statistical factors.
5. Summary
In this paper we use the June-Germany framework to predict the spread of infections within the German state of Rhineland Palatinate during the second COVID-19 wave from October 2020 to February 2021. The simulation was tuned on the overall number of confirmed COVID-19 deaths between October 2020 and mid-December 2020; no specific tuning at the district level was performed. We observe a good description of the spread of incidences per 10k inhabitants in all 36 districts and cities of Rhineland Palatinate. We observe a correlation coefficient of 0.5 between the simulated and measured incidences per 10k inhabitants. No specific local measures, such as dedicated information campaigns or special testing or tracing campaigns, have been implemented in the simulation. Nevertheless, the observed incidence rates per district can be described by the simulation. This suggests that the incidence rates per district are primarily driven by statistical effects, population statistics and state-wide regulations.
Footnotes
1. In fact, none of our simulations, nor variations in the assumed contact patterns or the interplay between age groups, could account for the deviations for the oldest age group between mid and end of January 2021. This deviation is not present in the raw infection data for this age group and shows up only in the hospitalisations. However, we argue that this mismodelling has little effect on the actual test of our hypothesis, since it is present in all districts and cancels to first order when comparing ratios.
References
1. Akdogan et al. (2023). JUNE-Germany: An Agent-Based Epidemiology Simulation Including Multiple Virus Strains, Vaccinations and Testing Campaigns. arXiv:2303.05742. https://doi.org/10.48550/arXiv.2303.05742
2. Aleta et al. (2022). Quantifying the Importance and Location of SARS-CoV-2 Transmission Events in Large Metropolitan Areas. Proceedings of the National Academy of Sciences 119(26):e2112182119. https://doi.org/10.1073/pnas.2112182119
3. Andrianakis et al. (2017). History matching of a complex epidemiological model of human immunodeficiency virus transmission by using variance emulation. Journal of the Royal Statistical Society. Series C, Applied Statistics 66:717–740. https://doi.org/10.1111/rssc.12198
4. Bower et al. (2010). Galaxy formation: A Bayesian uncertainty analysis. Bayesian Analysis 5:619–669. https://doi.org/10.1214/10-BA524
5. Bullock (2021). JUNE: Open-Source Individual-Based Epidemiology Simulation. Royal Society Open Science.
6. Chang et al. (2021). Mobility network models of COVID-19 explain inequities and inform reopening. Nature 589:82–87. https://doi.org/10.1038/s41586-020-2923-3
7. Cocucci et al. (2022). Inference in epidemiological agent-based models using ensemble-based data assimilation. PLOS ONE 17:e0264892. https://doi.org/10.1371/journal.pone.0264892
8. Heger et al. (2023). On the impact of school closures on COVID-19 transmission in Germany using an agent-based simulation. arXiv:2312.11338. https://doi.org/10.48550/arXiv.2312.11338
9. Iskauskas et al. (2022). Emulation and history matching using the hmer package. arXiv:2209.05265. https://doi.org/10.48550/arXiv.2209.05265
10. Pillai et al. (2023). Agent-based modeling of the Covid 19 Pandemic in Florida. arXiv:2306.11003. https://doi.org/10.48550/arXiv.2306.11003
11. Roda et al. (2020). Why is it difficult to accurately predict the COVID-19 epidemic? Infectious Disease Modelling 5. https://doi.org/10.1016/j.idm.2020.03.001
12. STERN (2021). Warum hat Rostock so wenig Infizierte? Der dänische Bürgermeister erklärt, was er anders macht. https://www.stern.de/wirtschaft/stunde-nullbuergermeister-von-rostock-erklaert–warum-beiihm-die-inzidenz-so-niedrig-ist-30006730.html, accessed 23 November 2023.
13. Vyklyuk et al. (2021). Modeling and analysis of different scenarios for the spread of COVID-19 by using the modified multi-agent systems - Evidence from the selected countries. Results in Physics 20:103662. https://doi.org/10.1016/j.rinp.2020.103662
14. Wang et al. (2016). Improved SIR epiDEM Model of Social Network Marketing Effectiveness and Experimental Simulation. Systems Engineering - Theory and Practice 36.
15. Zeitung (2020). Kann Deutschland von Tübingen lernen? https://www.faz.net/aktuell/politik/inland/umgang mit-corona-kann-deutschland-von-tuebingen lernen-17094239.html, accessed 23 November 2023.
16. Zensus (2011).
Article and author information
Funding
This work has been supported by the Johannes Gutenberg Startup Research Fund of the University of Mainz.
Acknowledgements
This work has been supported by the Johannes Gutenberg Startup Research Fund and would not have been possible without the ERC Grant LightAtLHC. Part of the simulations were conducted using the supercomputer Mogon II at Johannes Gutenberg University Mainz. The authors gratefully acknowledge the computing time granted on the supercomputer.
Publication history
- Version of Record published: August 20, 2025 (version 1)
Copyright
© 2025, Kirn et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.