Pandemic Simulation

Approach to the problem

Due to the huge dimensionality of the problem’s parameters, that in our opinion would have prevented the success of an analytical approach, a numerical/statistical method has been followed. In fact we developed Java and C++ tools both to simulate several disease models/scenarios and to retrieve and statistically treat the output data.
Another important point of our strategy has been to adopt a matrix point of view of the situation. The structure and the behaviour of this matrix changed during the study of the problem. In fact we started with a static approach where every cell represented a person who can’t make any movement and than we switched to a more realistic simulation, adding the possibility of movement and vaccines which inhibit the spread of the disease.
Our Java application works with the input parameters mentioned in the problem statement: the infection, healing and death probabilities. We also decided to develop our simulation based on some real epidemics and to compare our results with the ones expected from the aforesaid diseases. So we extracted some data for medical literature and adapted them to the ones requested in the problem statement.
2 Simulation tools
Our simulation has been developed firstly using a C++ algorithm, and then we switched to a Java application in order to use a graphical user interface (GUI).
2.1 C++ Algorithm
This application has been used to give us a quick idea of the behaviour and the spreading of the infection in a static model in which all the people are in contact each other (that will be thereafter referred to as the rock concert model). In this version of the program the basic logic of transmission of the disease has been sketched out.
Our first concern was to define a metric on the matrix: we decided to consider all the 8 cells neighboring to a chosen one as adjacent, and capable of infecting it as a consequence.
We also decided to let the disease spread from a single cell (the zero patient) located in the center of the matrix. In our opinion this choice does not involve a significant loss of generality, because in the real case the disease would be able to spread in a world large enough not to depend in a significant way on the starting position.
We finally agreed in fixing some additional rules regarding the behaviour of the simulation:
– healed people are immune from the disease and can no longer spread it;
– the illness cannot even be spread by dead people;
– the epidemic is thus to be considered exhausted when there are no longer any infected people in the
The better to follow the spread of the disease, we also decided to adopt a graphical approach according to the following numerical and chromatic codes:

At each time step (arbitrarily assumed equal to one day) the status of every cell is updated according to the input probabilities. The algorithm proceeds evaluating the status of each cell of the matrix scanning it by row and putting the updated values on a transport matrix. Every status change is evaluated generating a random number which is then compared with the corresponding probability.
The possible states are checked in the following order and with the corresponding outcomes:
– if the cell’s status s = 3 or s = 2 (the patient is dead or has already healed) the value remains the
– if the cell’s status s = 1 the patient is diseased and it can die (s = 3), heal (s = 2) or stay infected
(s = 1) according to the corresponding input probabilities;
– if the cell’s status s = 0 (the person is still healty) it can be infected depending on the corresponding
probability and the number of neighboring patient.
In the last case, in order to greatly reduce the number of random extractions (and thus improve the run time as a consequence) we searched a way to evaluate a total infection probability dependent on the number of adjacent patients. We figured out that this probability can be evaluated as complementary to that of resisting the attempts of infection by all the neighboring patients, as shown in figure (1).

Since the process is essentially binomial in nature, it’s in fact difficult to determine a direct formula for the cumulative probability as the sum of gradually smaller contributions, while it’s way easier to determine the overall probability of not being infected after n attempts, which turns out to be:

Hn = (1 − i)n;

thus resulting in a total infection probability given by:

In = 1 − (1 − i)n. (1)

Java Algorithm

In order to simulate more complex scenarios, a Java algorithm that included the following features has then been developed:
– the possibility to take into account the effects of vaccination with variable coverage;
– the movement of the people with different means of transportation;
– the dimension of the matrix and its population;
– a random starting position for the zero patient;
– the existence of various transmission routes corresponding to different infection radii;
– a Graphical User Interface (GUI) the better to handle all the aforesaid options.

3 Simulated models
First and foremost, we had to establish some values of the probabilities to run the simulation. Initially we defined the geometric locus in which the three probabilities had to stay in:
– The probability of infection i can assume values from 0 to 100% ;
– The sum of the probabilities of healing and death can assume values from 0 to 100% .
These conditions define a triangular prism, in which every point correspond to a triplet of probabilities that can :

We estimated some probabilities to run the program taking some data from medical literature. In particular, we analyzed the basic reproductive ratio R0, the average course of illness N and the mortality rate D of seven real diseases, given that:
– Basic reproductive ratio R0: average number of new cases that an ill patient generates during the
course of illness (considered as the whole duration of the disease);
– Course of illness N: average number of days that a disease lasts for, after which a patient can be
considered eventually healed or died;
– Mortality rate D: percentage of infected people that are dead after the course of the disease.
We summed up the data in a table:

These data have then been used to calculate the three probabilities needed:
– Probability of infection i: considered R0 as the ratio between the number of new cases that a pa- tient generates during the course of the disease N and the maximum number of tries this could have attempted over the same time, it can be expressed as:
R0 = N · i
In particular, in the static framework we have that the infection could spread the eight adjacent cells, so the probability i has to be multiplied by 8:R0 = 8N · i (2)

Thus, if follows that:
i = R0 8N

– Probability of death d: the mortality rate D can be considered as the cumulative probability of being dead after N days of the disease, it can be applied the same reasoning used to calculate In given i and n (1), changing the number of infected adjacent cells n with the course of the diasease N:
D = 1 − (1 − d)N

It follows that:
d = N√1 − D

– Probability of healing h: considered that the probability of being healed at the end of the course of
the diease is complementary to the probability of being dead, it follows that:
H = 1 − (1 − h)N = 1 − D ⇒ h = 1 − N√1 − H = 1 − N√D (3)
The three probabilities has been calculated for the seven diseases:

Some arbitrary values of the probabilities have then been established, covering with a greater density the low value zone, that is the more interesting one according to the data shown above. To do it properly we ordered the diseases by each probability, obtaining graphs in which the horizontal axis represents a cardinal number associated with the disease and the vertical one represents the corresponding probability, and fitting the medical data with exponential relations.
The described method is shown in the graph 4 for the healing probability, while the overall results for the three probability parameters are summarized in the following table.

Figure 4: The exponential fit used to obtain input values of the healing probability for our simulations.

The simulation

To extract the data needed to do statistics on the evolution of the disease, each model has been run 100 times over a 501 × 501 cells matrix; the odd number chosen is big enough to let the disease spread almost freely for many steps and permits to let it start from the very center of the matrix. For each run are obtained:
– Percentage of not even infected population NI(%);
– Percentage of the population dead of illness D(%);
– Percentage of the population healed from the disease H(%);
– Number of steps (cycles ran) needed for stability N.
The average value and the corresponding standard deviation σ of these four parameters has been calculated for each model.
4 Data analysis, static case
The data obtained from our simulations have then been analysed by using pivot tables. In these graphs the horizontal plane is defined by the three probabilities inserted in input in the simulations, while on the vertical axis is placed one of the four data obtained in output. We decided to focus our attention on two particular parameters: the days of duration of the pandemics N and the mortality rate D. The days of duration of the disease are shown as a function of the three different probabilities in graph 5.

As first thing, it has been noticed that the duration of the disease decreases considerably as the probability of healing increases. It is also possible to find an interesting fit for this relation: indeed, the logarithm of the duration of the disease as a function of the healing probability follows a power law model, for fixed death and infection probabilities:
lnN = a · hb, (4)
where b < 0 and a > 0.
Subsequently the mortality rate was placed in a graph and it has been possible to study its development in connection to the input probabilities.
As in the previous case, a similar result has been noticed: increasing the probability of healing, the mortality rates decreases. With this second graph, it can also be seen that some values of the mortality rate are much smaller than other ones and, for this reason, some areas must be zoomed in to be able to see these values.
Instead, considering the trend of the mortality rate compared to the probability of infection, it is possible to find a particular fit for this relation, indeed, the increasing part of the graph follows a logarithmic model, for fixed death and healing probabilities:
y = a · ln(x + b) + c,
where y = D and x = i.


The data obtained from the simulations were checked with those collected from some real diseases which had similar probabilities: measles and ebola. As measles had slightly different probabilities of death and healing from the one used in the simulations, the two mortality rates obtained are different and in particular the one calculated from the simulation is five time bigger than the measles one, but is still acceptable. Considering ebola, the two mortality rates are more similar but there are some differences in this case too,

due to the same motive as before.

As first result it has been proven that, if the probability of infection is high enough, the ratio between the percentage of healed patients H and the percentage of dead patients D is approximately the same between the probability of healing and the probability of death;

It has been also proven that decreasing the probability of infection this relation is not true any more.
Using the mortality rate graph it has been possible to study the change of this parameter in different situations, in connection with the change of the probabilities and two different kind of results have been obtained. It has been observed that, fixing the probabilities of infection and death and increasing the healing one, the mortality rate has a monotonic decrease.

Instead, if the probability of infection and healing are fixed, it has been proven that the biggest value of the mortality rate doesn’t always correspond to the biggest value of the probability of death: this happens because if the probability of death was very high many patients would die and the disease would not even have the time to spread.

This graph shows different developments of the mortality rate compared to the probabilities of infection. Different colours correspond to different curves: it can be seen that if the probability of infection is a low value, the mortality rate has a monotonic increasing; but as the probability grows, the mortality rate changes its trend and, in particular, has its biggest value when the probability of death is not the biggest. Instead, when the biggest value of the probability of infection is reached, the mortality rate has a monotonic increase, again.

More realistic models

In order to make the simulation more realistic, models describing the effect of vaccinations and movement have finally been added.
5.1 Vaccination and herd immunity
For simulating the effect of a vaccination we can choose a %V (vaccination percentage), inserted in the input data, which describe the percentage of vaccinated people inside the matrix, allocated with a random distribution. To introduce the vaccines in the simulation we considered the cell representing the person vaccinated with the number 4 and to represent it in the graphic interface it was colored with the blue color. Therefore, the vaccinated cell can not be contagious and at the end of the run all the vaccinated people were considered healthy. Before we started the simulation, we focused on the Herd Immunity Threshold (HIT), that is the percentage of the population that must be vaccinated so that the disease does not spread:

Starting with the theoretical estimation, we extracted N solving the following system by means of nu- merical methods: Considering the great number of uncertainties inherent in the simulated framework, the two estimated percentages can be considered compatible with each other and with the value reported in the medical literature (92 − 95%).

People movement and transmission ways

In order to simulate a more realistic behaviour, the algorithm has been modified to include the movement of people and different kinds of transmission modes. To obtain this new algorithm we had to add a new value to the matrix: −1. This value corresponds to the empty cell, necessary to allow the movement of people.
As mentioned before, we found it useful to define an additional parameter: the range of movement. This parameter determines the maximum number of cells that a person can travel in one step. The new position is determined by a random number for the x axis and a random number for the y axis ranging from – movement radius to + movement radius.
The movement rays we have defined are:
– static;
– walking (max 100 cells per axis);
– bike (max 200 cells per axis);
– car (max 400 cells per axis).
The random is an integer generated between the values just described. Being in the random the possibility of obtaining the value 0, people have the possibility to remain still. Another condition that we used when we decided to implement the movement is the fact that each person can not move on to another.
The other parameter added is the radius of infection: in this way it is not enough to check only the 8 adjacent cells to determine the probability of being infected, but the (N ·2 + 1)2 − 1 cells around the person forming a new control square.

For example, if the radius of infection is 3, as we can see in the image, the controlled cells are the 48 around. In this case the infected people in the delimited region are 5, so the probability of the central cell to be infected corresponds to:
In = 1 − (1 − i)5 We defined different rays of infection:
– direct Infection (r=1 cells);
– droplets (r=5 cells);
– air (r=10 cells);
– vector (r=20 cells).
After defining the movement, we proceeded to collect the data to find the HIT in a more realistic model. Using again our measles-like disease, vaccination rates that focused around the hypothetical HIT (from 80% to 99%), the maximum range of motion (Car) and the biggest radius of infection (Vector) we obtained this graph, representing the number of uninfected people compared to the vaccination rate:
It is found that HIT in the static model is quite similar to the one evaluated in this model. In fact, the HIT calculated from this model, as can be seen from the graph, is equivalent to 92%.


In this work, a numerical and statistical approach has been used to simulate the spread of various diseases on a two-dimensional matrix. Using C++ and Java languages, various algorithms have been developed to simulate the behaviour on a static framework first, and then extend the study to a more general case including vaccinations, people movement, and different transmission ways.
The problem statement required to consider three different probabilities: infection, healing and death ones. We decided to chose their values by comparison with data related to real diseases and checked that our simulation provided results in accordance with what was expected for the same epidemics.
Regarding the obtained results it is also essential to highlight that, while some output distributions (6) can be fitted with certain relations (4), on the other hand a precise mathematical law describing the spread of the disease can’t easily be defined in our two-dimensional model. This is shown for example in graph 9, where it can be noticed that the number of dead people isn’t always associated with the maximum death probability, but it is determined by conditions of unstable equilibrium in which the infectivity of the disease plays a decisive role too.
In the last part of this work, a more general model including vaccinations, people movement, and dif- ferent transmission ways, has finally been developed. Vaccinations have been firstly applied on the static framework, estimating the Herd Immunity Threshold for a measles-like disease both in fully numerical way and semi-theoretically, and verifying that the value obtained was compatible with that reported in the med- ical literature. Finally a fully dynamical framework was implemented, including various forms of people movement and transmission ways, verifying in this case too that the HIT obtained was compatible with the values previously estimated and reported in the medical literature.


Pandemic Simulation Research Authors