A Robust Stochastic Method of Estimating the
Transmission Potential of 2019-nCoV
arXiv:2002.03828v1 [q-bio.PE] 7 Feb 2020
Jun Li
University of Technology Sydney, Broadway 123, NSW 2007
Abstract—The recent outbreak of a novel coronavirus (2019-nCoV) has quickly evolved into a global health crisis. The transmission potential of 2019-nCoV has been modelled and studied in several recent research works. Key factors such as the basic reproductive number, R0, of the virus have been identified by fitting contagious disease spreading models to aggregated data. The data include the reported cases both within China and in closely connected cities around the world.
In this paper, we study the transmission potential of 2019-nCoV from the perspective of the robustness of the statistical estimation, in light of varying data quality and timeliness in the initial stage of the outbreak. A sample consensus algorithm has been adopted to improve model fitting when outliers are present. The robust estimation enables us to identify two clusters of transmission models, both of substantial concern: one with R0 in the range 8 to 14, comparable to that of measles, and the other implying a large initial infected group.
Highlights
• We introduce robust transmission model fitting. We employ the random sample consensus algorithm to fit a susceptible-exposed-infectious-recovered (SEIR) infection model.
• We identify data consistency issues and raise flags for i) a potentially highly infectious epidemic and ii) further investigation of records with unexplained statistical characteristics.
• This analysis accounts for the spreading in 80+ cities in China with multi-million populations, which are connected to the original outbreak location (Wuhan) during the mass travel period (chunyun).
  – Traffic is considered in [8], but for the purpose of modelling the population variation within Wuhan, the outbreak site.
• As the virus is active and the analysis and control of the epidemic is an urgent endeavour, we choose to release all source code and implementation details although the research is ongoing. The scientific ramification is that conclusions may need further revision as richer and better prepared data become available. We have published our implementation on Github. All procedures are included in a single Python notebook. We have only used publicly available data in the research, which have also been made available with the project.
• The quality and reliability of estimation could be further improved by adopting richer data from commercial sources or authorities. More discussion in this regard can be found in the conclusion section.
I. INTRODUCTION
Since December 2019, a new strain of coronavirus (2019-nCoV) has been spreading in Wuhan, Hubei Province, China [8]. The initial cases of infection had suspected exposure to wild animals. However, by the time cases were reported globally in mid-January 2020, including in Southeast and East Asia as well as the United States and Australia, the virus showed signs of sustained human-to-human transmission (on 21 January 2020, the WHO suggested there was possible sustained human-to-human transmission). With the mass travel prior to Chinese New Year (Chunyun), the virus spread to major cities in China and to densely populated cities within Hubei Province.
There have been a number of epidemiological analyses of the transmission potential of 2019-nCoV. Read et al. [6] fit a susceptible-exposed-infectious-recovered (SEIR) metapopulation infection model to reported cases in Wuhan and major cities connected by air traffic. In [8], an SEIR model has been estimated by including surface traffic from location-based services data of Tencent. However, neither the air traffic to international destinations nor the aggregated people throughput to Wuhan can help establish the transmission model among populous cities in China connected to Wuhan mainly via surface traffic. Significantly, the reported cases in those populations connected to Wuhan are important for robust estimation of the transmission potential of the virus. This is particularly important in the initial stage of the outbreak, as the initial reports can be prone to various disturbances, such as delay or misdiagnosis, which are identified in our robust analysis below.
In this work, we present a study on robust methods of fitting infection models to empirical data. We propose to employ the random sample consensus (RANSAC) algorithm [3] to achieve robust parameter estimation. SEIR and most infection models of contagious diseases are designed for retrospective (review) analysis [2]. On the other hand, to provide a useful forecast in the outbreak stage of a new disease, transmission models must be established using data that are insufficient in terms of both quantity and quality. The maximum likelihood model estimation used by most existing studies is sensitive to outliers. Therefore, the estimated parameters can be unreliable due to the quality of the data in the initial stage of an epidemic. The issue is rooted in the combination of the quality of the data and the sensitivity of the fitting method; therefore it is not easily addressed or captured by traditional sensitivity analysis techniques such as bootstrapping.
The random sample consensus algorithm alleviates the predominant influence on the model fitting of the infection records from the place of origin, Wuhan, and nearby cities. The selected model reveals different statistical characteristics in the spreading of the virus in different cities, according to the local records, which deserve further investigation.
By identifying and accounting for a large volume of records of uncertain timeliness and accuracy, we have identified two candidate groups of models that agree with empirical records. One group has a significantly higher R0, at the level of measles; the other has an R0 similar to previously reported values [8], [6] but suggests that there was already a large number of infected individuals on 1 January 2020.
II. METHOD
A. Data Source
This research follows a procedure similar to that of [8] for acquiring and processing data on confirmed cases and public transportation. The infection reports are summarised daily by Pengpai News [5], which collects reports from the Health Commissions of local administrations of different provinces and cities. We include the major populated areas with strong connections with Wuhan in this study. We selected the locations which i) have a population greater than 3 million and ii) are among the top 100 destinations for travellers departing from Wuhan on 22 January (the day before the lockdown of the city for quarantine purposes). We include 84 cities, including Wuhan, in this study.
We collect population data from various sources on the World Wide Web. The transportation data are from the Baidu migration index [1], based on their record of location-based services. We estimated the absolute number of travellers by aligning the index to a reported number of 4.09M during the period of 10-20 January 2020.
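One possible reading of this alignment step is a simple rescaling, sketched below. It assumes that the reported 4.09M figure is the total number of travellers over the 10-20 January window, so that the index values in that window are scaled to sum to 4.09M. The function name and the daily index values are hypothetical and serve only as an illustration.

```python
REPORTED_TOTAL = 4.09e6  # travellers reported for 10-20 January 2020 (from the text)

def index_to_travellers(index, window_days):
    """Scale a relative migration index to absolute traveller counts.

    index       : list of daily index values (relative units)
    window_days : positions in `index` covered by the reported total
    """
    window_sum = sum(index[d] for d in window_days)
    scale = REPORTED_TOTAL / window_sum  # travellers per index unit (assumed constant)
    return [value * scale for value in index]

# Hypothetical index values for 10-22 January (13 days), illustration only.
daily_index = [6.0, 6.5, 7.0, 7.2, 7.5, 8.0, 8.4, 9.0, 9.6, 10.2, 11.0, 11.8, 12.5]
window = range(11)  # the first 11 entries correspond to 10-20 January
travellers = index_to_travellers(daily_index, window)
```

Under this assumption, days outside the alignment window (here 21-22 January) are converted with the same scale factor.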
In the data collection, infections outside China are summarised at the country level and the specific cities are missing.
We exclude this part of infection records since entire countries
have a different distribution of population than individual
populated areas. Such evidence can be considered in future
research by employing more geographical/demographical data
as well as volumes of traffic connections.
B. Transmission Model and Fitting to Data
1) SEIR metapopulation infection model: In this research, we adopt the susceptible-exposed-infectious-recovered (SEIR) model of the development and infection process of 2019-nCoV, similar to that in [6]. The model includes a dynamic component corresponding to people movement between populated areas. The transmission model is defined as follows

$$\frac{dS_j(t)}{dt} = -\beta \left( \sum_c \frac{K_{c,j}(t)}{n_c} I_c + I_j \right) \cdot \frac{S_j(t)}{n_j} \tag{1}$$

$$\frac{dE_j(t)}{dt} = \beta \left( \sum_c \frac{K_{c,j}(t)}{n_c} I_c + I_j \right) \cdot \frac{S_j(t)}{n_j} - \alpha E_j(t) \tag{2}$$

$$\frac{dI_j(t)}{dt} = \alpha E_j(t) - \gamma I_j(t) \tag{3}$$

$$\frac{dR_j(t)}{dt} = \gamma I_j(t) \tag{4}$$

where S, E, I, R represent the numbers of susceptible, exposed, infected and recovered (non-infectable) subjects, and n_j denotes the population of area j. Equations (1)-(4) specify the dynamics of the disease spreading in a set of populated areas connected by a traffic network. The subscript j runs over the areas, e.g. cities.
Spreading dynamics: The model parameters α, β, γ control the dynamics of the disease spreading. In a unit of time, exposed subjects become infected at a rate of α. Thus the mean latent (incubation) period is 1/α, which ranged from 3.8 to 9 days in previous epidemiological studies of CoVs [7], [4]. We use α = 1/7 according to empirical observation as of Feb 2020. The model and the fitting process are not hypersensitive to this parameter [6]. Parameter β represents the transmission rate, i.e. the rate at which susceptible subjects become exposed through contact with infected subjects in one time unit, as in Equations (1) and (2). Parameter γ determines the rate of recovery; recovered subjects are removed from the pool of subjects who can be infected. The parameters β and γ are estimated by fitting the model to data using a stochastic searching strategy, as discussed below.
Transportation dynamics: Between-area dynamics is specified by a traffic model, which entails a set of connectivity matrices K(t), where an entry K_{i,j}(t) is the number of travellers from area i to area j at time t. The transportation model dictates that at time t, $\sum_c \frac{K_{c,j}(t)}{n_c} I_c$ infected subjects arrive at area j and start infecting susceptible subjects in the destination area j.
Initial infections: At t = 0, which is set to 1 January 2020 in this study, the number of infected cases at Wuhan is set to a seeding number I_W(0). I_W(0) is a parameter inferred from data as in [6]. Alternatively, a zoonotic infection model is used in [8], considering the evidence of an animal origin of 2019-nCoV.
2) Model Fitting via Maximum Likelihood and Challenges: There are three parameters to specify in the metapopulation SEIR model, denoted by a vector θ: (β, γ, I_W(0)). Most existing studies adopt the maximum likelihood method to infer model parameters from empirical data. The inference is an optimisation process, with the objective defined through the probability of observing the empirical data given the model predictions, e.g.

$$\theta^* := \arg\min_{\theta} \sum_t -\log P(\mathbf{x}_t \mid \mathrm{SEIR}(t; \theta)) \tag{5}$$
where P(x|µ) represents the probability density/mass of observing x given model prediction µ. The probability is accumulated over time t. Note that we use boldface symbols to indicate that both observed data x and model prediction µ can be vectors containing the information of the disease at multiple locations. Theoretically, the inference optimisation in (5) can be established by using any observation model. However, in practice, to estimate the transmission characteristics of a contagious disease during the outbreak stage, the empirical observations are usually limited to sporadic reports of confirmed infection cases, as the exposed latent subjects cannot be identified and waiting for recovery cases is not a viable option for nowcasting and forecasting studies.
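The objective in (5) can be illustrated with a minimal sketch. It assumes the Poisson observation model that this paper adopts later for daily infection numbers; the function name and the example numbers are hypothetical, and the observations and predictions are flattened into plain sequences over days and locations.

```python
import math

def poisson_neg_log_likelihood(observed, predicted):
    """Sum of -log P(x | mu) over paired observations and model predictions.

    Assumes each observed count x is Poisson distributed with mean mu.
    """
    nll = 0.0
    for x, mu in zip(observed, predicted):
        mu = max(mu, 1e-9)  # guard against log(0) for zero predictions
        # log Poisson pmf: x*log(mu) - mu - log(x!)
        nll -= x * math.log(mu) - mu - math.lgamma(x + 1)
    return nll

# Hypothetical daily counts and two candidate model predictions.
observed = [3, 5, 8, 13]
nll_close = poisson_neg_log_likelihood(observed, [3.2, 5.1, 8.3, 12.5])
nll_far = poisson_neg_log_likelihood(observed, [1.0, 1.0, 1.0, 1.0])
```

A parameter search would evaluate this quantity for the predictions produced by each candidate θ and keep the θ with the smallest value; here the prediction that tracks the counts closely yields the lower objective.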
Relying on confirmed infections can make model parameter estimation difficult. On one hand, the initial observations are often of suboptimal quality in terms of both timeliness and accuracy. As a new disease starts spreading, the first cases can be misdiagnosed, especially when the symptoms are mild in a significant portion of infectious subjects or of the infectious period. On the other hand, the negative log-likelihood objective function is usually dominated by the observations in the original location, where the disease starts spreading. Therefore, it is possible that significantly disturbed observations in the original location lead to biased estimation of the model. The systematic bias is not easily dealt with by traditional statistical techniques such as bootstrapping.
3) RANSAC Algorithm of Robust Model Fitting: The random sample consensus (RANSAC) method is designed for model estimation with a significant amount of outliers in data. The essential idea is to fit a simple model (3 adjustable parameters in the SEIR model) using the minimum number of data points randomly drawn from the dataset. Algorithm 1 shows the steps of the algorithm.
Algorithm 1: RANSAC Algorithm of Fitting SEIR Model to Infection Data
Input: Rounds of random sampling, n_R, and number of random samples in each round of model fitting, n_s
Input: Daily records of infections of T days and n_L locations, X : [n_L × T]
Input: Model fitting function: f : {x_1, ..., x_{n_s}} → (β, γ, I_W(0))
Input: Inlier counting: g : (β, γ, I_W(0)), X → n_In
Result: Optimal parameters: β*, γ*, I_W*(0)
 1  Initialise n_In* ← −1
 2  for i ← 1 to n_R do
 3      Randomly draw l_i from {1, ..., n_L}
 4      Randomly draw n_s samples from X[l_i, ...]: {x_{i1}, ..., x_{i n_s}}
 5      β, γ, I_W(0) ← f(x_{i1}, ..., x_{i n_s})
 6      n_In ← g((β, γ, I_W(0)), X)
 7      if n_In > n_In* then
 8          n_In* ← n_In
 9          β*, γ*, I_W*(0) ← β, γ, I_W(0)
10      end
11  end
In the algorithm, the steps from line 7 to line 9 choose the model achieving maximum consensus among the random samples. The function f executes the maximum likelihood model fitting. However, the optimisation is straightforward here, as there are only n_s daily infection data points from one location l_i to fit. We choose n_s = 4 in this study to determine the 3 parameters of the SEIR model. There are thus 4 constraints and 3 degrees of freedom, where the one extra constraint helps stabilise the optimisation. The function g counts inliers in the whole dataset for a given SEIR model. To be considered an inlier, a recorded infection number at time t in place l needs to fall within the 5% to 95% CI of the model prediction at that time and location. Following [6], we use the Poisson distribution to approximate the probability distribution of the infection number within one day in a location.
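The sampling loop and the Poisson inlier test described above can be sketched as follows. This is an illustration under stated assumptions, not the released implementation: the 5% and 95% bounds are taken as Poisson quantiles, and a toy exponential-growth model with a log-linear least-squares fit (`fit_toy`, `predict_toy`, both hypothetical) stands in for the SEIR model and its maximum likelihood fit f.

```python
import math
import random

def poisson_interval(mu, lo=0.05, hi=0.95):
    """Return (k_lo, k_hi): the lo- and hi-quantiles of a Poisson(mu) count."""
    mu = max(mu, 1e-9)  # guard against log(0)
    cdf, k, k_lo = 0.0, 0, None
    while True:
        # Poisson pmf evaluated in log space for numerical stability
        cdf += math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))
        if k_lo is None and cdf >= lo:
            k_lo = k
        if cdf >= hi:
            return k_lo, k
        k += 1

def count_inliers(predicted, observed):
    """Function g: count records inside the 5%-95% band of the prediction."""
    n_in = 0
    for mu_row, x_row in zip(predicted, observed):
        for mu, x in zip(mu_row, x_row):
            k_lo, k_hi = poisson_interval(mu)
            n_in += int(k_lo <= x <= k_hi)
    return n_in

def ransac(observed, fit, predict, n_rounds, n_s=4, seed=0):
    """Steps 1-11 of Algorithm 1 with user-supplied fit (f) and predict."""
    rng = random.Random(seed)
    n_loc, n_days = len(observed), len(observed[0])
    best_theta, best_n_in = None, -1
    for _ in range(n_rounds):
        loc = rng.randrange(n_loc)                           # step 3
        days = sorted(rng.sample(range(n_days), n_s))        # step 4
        theta = fit(days, [observed[loc][t] for t in days])  # step 5
        n_in = count_inliers(predict(theta), observed)       # step 6
        if n_in > best_n_in:                                 # steps 7-9
            best_theta, best_n_in = theta, n_in
    return best_theta, best_n_in

# Toy stand-in for the SEIR model: mu(t) = a * exp(r * t) at every location.
def fit_toy(days, counts):
    """Log-linear least squares (a simplification of maximum likelihood)."""
    ys = [math.log(max(c, 0.5)) for c in counts]
    t_bar, y_bar = sum(days) / len(days), sum(ys) / len(ys)
    r = (sum((t - t_bar) * (y - y_bar) for t, y in zip(days, ys))
         / sum((t - t_bar) ** 2 for t in days))
    return math.exp(y_bar - r * t_bar), r

N_DAYS = 15
clean = [round(2.0 * math.exp(0.2 * t)) for t in range(N_DAYS)]
observed = [clean, clean, [0] * N_DAYS]  # third location: corrupted records

def predict_toy(theta):
    a, r = theta
    row = [a * math.exp(r * t) for t in range(N_DAYS)]
    return [row, row, row]

theta, n_in = ransac(observed, fit_toy, predict_toy, n_rounds=200)
```

In this toy setting, a model fitted to the corrupted location explains few records from the other two locations, so the consensus step favours a model fitted to clean samples.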
III. ESTIMATION AND PREDICTION OF EPIDEMIC SIZE
A. Parameters of SEIR Transmission Model
Due to the size of the populations and the short period of interest, we can ignore the change of the population due to birth or death during the process. Thus the basic reproductive number in this SEIR model can be estimated as R0 ≈ β/γ.
Figure 1 shows the model parameters fitted to the minimum (n_s = 4) random samples in 1,000 RANSAC iterations. In the figure, each model is specified by a pair of parameters: the basic reproductive number R0 and the estimated infection number in Wuhan on 1 January 2020, I_W(0). The number of inliers in the last 5 days of the recorded period (up to 5 Feb 2020) is taken as the fitness of the corresponding model. Fitness is indicated by the colour in the figure. The model producing the greatest number of inliers is marked by a triangle in the figure.
In Figure 1, as far as the available data is concerned, there is a structure of two main clusters indicating candidates of valid models. Intuitively, one cluster ("1") corresponds to the possibility of a highly infectious virus starting from a relatively small group of subjects. The other cluster ("2") indicates an R0 that is more consistent with existing estimations, but with the virus having started from a large number of individuals, vastly exceeding the current expectation. The parameter set leading to the greatest fitness in the RANSAC process is from cluster 2,

$$\beta^* = 0.642, \quad \gamma^* = 0.135, \quad R_0^* = 4.76, \quad I_W^*(0) \approx 641,$$

which has 256 out of 425 daily infection numbers (from 85 places in the last 5 days) falling within the inlier zone.
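As a quick consistency check, the reported values can be related to one another using only the numbers stated above. The interpretation of 1/γ as a mean infectious period in days assumes a time unit of one day, which matches the daily records used here.

```latex
\begin{align*}
R_0^* &\approx \frac{\beta^*}{\gamma^*} = \frac{0.642}{0.135} \approx 4.76, \\
\frac{1}{\gamma^*} &= \frac{1}{0.135} \approx 7.4 \ \text{days (mean infectious period)}, \\
\text{inlier fraction} &= \frac{256}{425} \approx 0.60, \qquad 425 = 85 \times 5 .
\end{align*}
```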
It is too early to rule out either or both possibilities. It has become evident that the virus can show mild or no symptoms in a significant portion of infections. Given also that the virus was previously unknown to humans, it is not impossible that the virus had been circulating for a period, even with sporadic severe cases being misdiagnosed as other diseases, before a group of severe infections eventually broke out and attracted attention.
[Figure 1: two panels, (a) and (b). Horizontal axis: Infections on 1 Jan 2020. Vertical axis: Basic Reproductive Number R0. Colour scale: Number of Inliers (Recent 5D).]
Fig. 1. SEIR model parameter estimation using RANSAC

[Figure 2: one panel, 成都市 (Chengdu). Horizontal axis: date, 12 Jan 2020 to 9 Feb 2020.]
Fig. 2. Simulation and forecasting of infections in a major China city, compared with reported cases. The bold red curve represents the infection number predicted by running a simulation using the SEIR model selected by the RANSAC algorithm. The markers correspond to accumulated infection numbers up to the dates. Triangles indicate that the newly reported infections of the corresponding days are classified as outliers given the predicted Poisson distributions. Red up-triangles indicate that the recorded value exceeds the upper bound of the CI (the infection number is too high according to the model). Green down-triangles represent the opposite cases. Blue circles represent inliers.
B. Simulation of Infection in Metapopulation
We have built a metapopulation SEIR model using the parameters β*, γ* and I_W*(0) selected by the RANSAC algorithm above. We then run simulations using the fitted SEIR model and compare the model prediction with empirical data of infection recorded in different cities across China. Figure 2 shows the simulation result and the accumulated infection data for one major Chinese city, Chengdu. The model simulation explains the newly identified infections on a significant number of days during the period of interest. See the figure caption for a detailed interpretation of the curves and marks in the plots.
Simulation results for 80+ major Chinese cities with strong connections with Wuhan are available in the figures (Figures 3-7) at the end of this document. The simulation results suggest that the spreading of 2019-nCoV in China's megacities (e.g. Beijing, Shanghai, Guangzhou and Shenzhen) exceeds the expectation of the overall SEIR model. The model simulation matches the observations in a range of large Chinese cities, such as the capital cities Shijiazhuang, Zhengzhou and Xi'an of the mid-west provinces. However, the spreading rate is greater than expected in cities closely connected to Wuhan.
On the other hand, for satellite cities with the closest connections with Wuhan, the recorded infection cases are significantly lower than expected. For Wuhan itself, the record is lower than expected by several orders of magnitude. We will discuss possible explanations in the next section.
IV. CONCLUSION, LIMITS AND FUTURE RESEARCH
In this study, we adopt a robust model fitting method,
random sample consensus, which has enabled us to establish stable SEIR model families and identify outliers in the
infection data of 2019-nCoV. The random sample consensus
is made possible by employing traffic network dynamics in
the SEIR model to handle the infection in cities connected to
Wuhan.
A. Improve Data Quality
Domestic and international airline traffic: We did not include international cities and air traffic in the current analysis. One reason is that our focus is on populous cities in China, where the volume of travellers by train vastly exceeds that by air. The airline data can be added in future research.
Traffic networks: The current transportation matrices K have only one row of values, corresponding to the travellers departing Wuhan. This would not be a major issue in the period when the first generation of human-to-human transmission is our main concern. The inter-city traffic would play a more significant role in the spreading of the virus after cities other than Wuhan had accumulated an infected population.
Early infection data: A phenomenon demanding explanation is that the SEIR model has failed to capture the variations of the infection data within Wuhan and nearby cities. What is fairly surprising is that the SEIR model overestimated the infection numbers. This is counter-intuitive because it is those cities that are most affected by the virus and have a large number of infections. This could be the result of poor data quality, or of the spreading mode having changed between different stages of the spreading.
B. Modelling Tools
We used the SEIR model to represent the characteristics of the infection data. The model is effective and simple to fit, thanks to the simplicity of its parameter structure (3 parameters only). On the other hand, ODE-based modelling is simultaneously stiff and sensitive. Modern end-to-end learning-based models can be considered in future research.
REFERENCES
[1] Baidu, 2020. qianxi.baidu.com.
[2] Gerardo Chowell, James M. Hyman, Luís M. A. Bettencourt, and Carlos Castillo-Chavez, editors. Mathematical and Statistical Estimation Approaches in Epidemiology. Springer, 2009.
[3] Martin A. Fischler and Robert C. Bolles. Random Sample Consensus:
A Paradigm for Model Fitting with Applications to Image Analysis and
Automated Cartography. Comm. ACM, 24(6), 1981.
[4] Gabriel M. Leung, Anthony J. Hedley, Lai-Ming Ho, Patsy Chau,
Irene O.L. Wong, Thuan Q. Thach, Azra C. Ghani, Christl A. Donnelly,
Christophe Fraser, Steven Riley, Neil M. Ferguson, Roy M. Anderson,
Thomas Tsang, Pak-Yin Leung, Vivian Wong, Jane C.K. Chan, Eva
Tsui, Su-Vui Lo, and Tai-Hing Lam. The Epidemiology of Severe Acute
Respiratory Syndrome in the 2003 Hong Kong Epidemic: An Analysis
of All 1755 Patients. Annals of Internal Medicine, 141, 2004.
[5] Pengpai News, 2020. www.thepaper.cn.
[6] Jonathan M Read, Jessica RE Bridgen, Derek AT Cummings, Antonia Ho, and Chris P Jewell. Novel coronavirus 2019-nCoV: early estimation of epidemiological parameters and epidemic predictions. medRxiv, 2020.
[7] Victor Virlogeux, Vicky J. Fang, Minah Park, Joseph T. Wu, and Benjamin J. Cowling. Comparison of incubation period distribution of human
infections with MERS-CoV in South Korea and Saudi Arabia. Scientific
Reports, 6(35839), 2016.
[8] Joseph T Wu, Kathy Leung, and Gabriel M Leung. Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study. Lancet, 2020.
[Figure 3 panels: 重庆市 (Chongqing), 上海市 (Shanghai), 北京市 (Beijing), 成都市 (Chengdu), 天津市 (Tianjin), 广州市 (Guangzhou), 临沂市 (Linyi), 苏州市 (Suzhou), 保定市 (Baoding), 邯郸市 (Handan), 石家庄市 (Shijiazhuang), 哈尔滨市 (Harbin), 郑州市 (Zhengzhou), 西安市 (Xi'an), 温州市 (Wenzhou), 周口市 (Zhoukou), 杭州市 (Hangzhou), 武汉市 (Wuhan), 深圳市 (Shenzhen), 南阳市 (Nanyang). Horizontal axis of each panel: date, 12 Jan 2020 to 9 Feb 2020.]
Fig. 3. Simulation and forecasting of infections in major China cities and comparison to accumulated cases. See Figure 2 for detailed interpretation of the marks and legends used in the plots.
[Figure 4 panels: 徐州市 (Xuzhou), 赣州市 (Ganzhou), 泉州市 (Quanzhou), 南京市 (Nanjing), 盐城市 (Yancheng), 福州市 (Fuzhou), 湛江市 (Zhanjiang), 衡阳市 (Hengyang), 邢台市 (Xingtai), 邵阳市 (Shaoyang), 南宁市 (Nanning), 黄冈市 (Huanggang), 驻马店市 (Zhumadian), 商丘市 (Shangqiu), 长沙市 (Changsha), 沧州市 (Cangzhou), 阜阳市 (Fuyang), 南通市 (Nantong), 东莞市 (Dongguan), 菏泽市 (Heze). Horizontal axis of each panel: date, 12 Jan 2020 to 9 Feb 2020.]
Fig. 4. Simulation and forecasting of infections in major China cities and comparison to accumulated cases. See Figure 2 for detailed interpretation of the marks and legends used in the plots.
[Figure 5 panels, with romanised labels as printed in the figure: 南充市 (Nanyun), 洛阳市 (Luoyang), 无锡市 (Wuxi), 信阳市 (Xinyang), 新乡市 (Xinxiang), 合肥市 (Hefei), 台州市 (Taizhou), 荆州市 (Jingzhou), 襄阳市 (Xiangyang), 岳阳市 (Yueyang), 达州市 (Dazhou), 宜春市 (Yichun), 六安市 (Liuan), 常德市 (Changde), 昆明市 (Kunming), 上饶市 (Shangrao), 宿州市 (Suzhou), 安庆市 (Anqing), 永州市 (Yongzhou), 安阳市 (Anyang). Horizontal axis of each panel: date, 12 Jan 2020 to 9 Feb 2020.]
Fig. 5. Simulation and forecasting of infections in major China cities and comparison to accumulated cases. See Figure 2 for detailed interpretation of the marks and legends used in the plots.
[Figure 6 panels, with romanised labels as printed in the figure: 南昌市 (Nanchang), 平顶山市 (Pingdingshan), 吉安市 (Ji'an), 桂林市 (Guilin), 怀化市 (Huaihua), 九江市 (Jiujiang), 孝感市 (Xiaogan), 亳州市 (Haozhou), 开封市 (Kaifeng), 泰州市 (Taizhou), 惠州市 (Huizhou), 扬州市 (Yangzhou), 益阳市 (Yiyang), 许昌市 (Xuchang), 抚州市 (Fuzhou), 株洲市 (Zhuzhou), 娄底市 (Loudi), 湘潭市 (Xiangtan), 宜昌市 (Yichang), 郴州市 (Binzhou). Horizontal axis of each panel: date, 12 Jan 2020 to 9 Feb 2020.]
Fig. 6. Simulation and forecasting of infections in major China cities and comparison to accumulated cases. See Figure 2 for detailed interpretation of the marks and legends used in the plots.
[Figure 7 panels: 濮阳市 (Puyang), 焦作市 (Jiaozuo), 厦门市 (Xiamen), 十堰市 (Shiyan), 恩施州 (Enshi). Horizontal axis of each panel: date, 12 Jan 2020 to 9 Feb 2020.]
Fig. 7. Simulation and forecasting of infections in major China cities and comparison to accumulated cases. See Figure 2 for detailed interpretation of the marks and legends used in the plots.