Summary

The notebook

Hi, this notebook page analyses the 2019-2020 pandemic outbreak of the SARS-CoV-2 virus in Europe.

It was written to:

  • provide the current status of the outbreak as reported by the authorities;
  • present this status with interactive charts;
  • write a small library to analyse and predict the plateau and duration of the disease;
  • add something new to my personal website.

The disease

The disease COVID-19 is caused by the virus SARS-CoV-2 (Severe Acute Respiratory Syndrome Coronavirus 2).

Your help

This notebook let me practise some older skills and eventually reach out to an interested audience.
If you have a suggestion on how to make the notebook more useful, I will be thankful (nuno.aja@gmail.com).

Kind regards, The author.

NOTICE: ongoing work


Section I

In this section we focus on downloading, processing and loading the current datasets.

For convenience and security, the code that fetches and presents the data is wrapped in a small library, so the cells below only call its functions.


In [1]:
import modules_loader
from modules.analytics.sars_cov2_2019_20 import analytics
In [2]:
analysis = analytics.scenario()
available_countries = analysis.download_source_datafiles(force = True)
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-9xqax9d0 because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
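The Matplotlib warning above goes away once MPLCONFIGDIR points at a writable directory before Matplotlib is first imported. A minimal sketch, assuming it runs in a cell ahead of In [1]; the directory name `matplotlib-config` is a hypothetical choice, not something the notebook defines:

```python
import os
import tempfile
from pathlib import Path

# Hypothetical location: any directory the notebook user can write to will do.
config_dir = Path(tempfile.gettempdir()) / "matplotlib-config"
config_dir.mkdir(parents=True, exist_ok=True)

# This must be set before the first `import matplotlib` in the process,
# otherwise Matplotlib has already picked its temporary cache directory.
os.environ["MPLCONFIGDIR"] = str(config_dir)
```

Unlike the temporary directory mentioned in the warning, a fixed directory lets Matplotlib reuse its font cache between runs.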
In [3]:
target_cases = analysis.process_dataset(available_countries)
Preparing dataset for 192 coutries.
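
The tables in Section II are consistent with Active = Infected - Recovered - Deaths. The sketch below assumes that this is how the processed dataset derives the Active column; `CaseCounts` is a hypothetical helper type, not part of the notebook's library:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseCounts:
    """Cumulative counts for one location (hypothetical helper type)."""

    infected: int
    recovered: int
    deaths: int

    @property
    def active(self) -> int:
        # Assumption: a case is active if it is neither recovered nor dead.
        return self.infected - self.recovered - self.deaths


# Figures copied from the Section II table.
us = CaseCounts(infected=33056765, recovered=0, deaths=588539)
france = CaseCounts(infected=5863138, recovered=324444, deaths=107390)

print(us.active)      # 32468226
print(france.active)  # 5431304
```

Under this assumption, a location that reports 0 recoveries shows every surviving case as active, so its Active figure is best read as an upper bound.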

Section II

In this section we explore the current data.


Selecting the 20 countries with the most active cases

We now select countries by number of active cases and plot the result as a bar chart.

The library function plots the data and saves the chart automatically.

Equivalent code would be:

top_20_countries_active_cases = analysis.statistics_by_country.sort_values(by='Active', ascending=False).head(20)
top_20_countries_active_cases.iplot(kind='bar', subplots=False, title='The (20) countries with most active cases');
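
A self-contained variant of the selection step, as a sketch: it assumes `statistics_by_country` is a pandas DataFrame indexed by country, uses a few rows copied from the table further down (locations without provinces), and leaves out the plotting call:

```python
import pandas as pd

# Rows copied from the table in this notebook; the frame itself is a stand-in
# for analysis.statistics_by_country, whose exact layout is an assumption.
statistics_by_country = pd.DataFrame(
    {
        "Active": [299486, 3027925, 32468226, 1369911, 3401684],
        "Infected": [4178261, 26031991, 33056765, 15894094, 3631661],
    },
    index=pd.Index(["Italy", "India", "US", "Brazil", "Spain"], name="Country"),
)

# Sort by active cases, largest first, and keep the first three rows.
top_3 = statistics_by_country.sort_values(by="Active", ascending=False).head(3)
print(top_3)
```
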
In [4]:
top_20_countries_active_cases = analysis.show_top_cases(20, feature = 'Active')

The 20 countries with most active cases

The data for these selected countries follows.

In [5]:
display(analysis.statistics_by_province.loc[list(top_20_countries_active_cases.index)].sort_values(by='Active', ascending=False))
Country         Province                                        Active  Infected  Recovered  Deaths
US                                                            32468226  33056765          0  588539
France                                                         5431304   5863138     324444  107390
United Kingdom                                                 4327520   4455221          0  127701
Spain                                                          3401684   3631661     150376   79601
India                                                          3027925  26031991   22712735  291331
Netherlands                                                    1597994   1615500          0   17506
Brazil                                                         1369911  15894094   14080089  444094
Sweden                                                         1040822   1055173          0   14351
Belgium                                                        1016912   1041706          0   24794
Serbia                                                          702139    708878          0    6739
Canada          Ontario                                         516832    525365          0    8533
Iran                                                            436025   2804632    2290613   77994
Switzerland                                                     357792    686152     317600   10760
Canada          Quebec                                          354576    365642          0   11066
Argentina                                                       339211   3447044    3035134   72699
Italy                                                           299486   4178261    3753965  124810
Greece                                                          280039    385444      93764   11641
Russia                                                          263604   4917906    4538909  115393
Mexico                                                          259873   2390140    1909187  221080
Ireland                                                         226565    254870      23364    4941
Canada          Alberta                                         220117    222279          0    2162
Ukraine                                                         207478   2227400    1969055   50867
Canada          British Columbia                                139292    140953          0    1661
Canada          Manitoba                                         45897     46916          0    1019
Canada          Saskatchewan                                     44606     45128          0     522
France          Mayotte                                          17041     20176       2964     171
France          Guadeloupe                                       13584     16079       2250     245
France          French Guiana                                    12010     22115       9995     110
France          Martinique                                       11481     11669         98      90
Canada          Nova Scotia                                       4991      5065          0      74
Canada          New Brunswick                                     2055      2098          0      43
France          Reunion                                           1681     23566      21709     176
Canada          Newfoundland and Labrador                         1210      1216          0       6
Canada          Nunavut                                            632       636          0       4
France          Saint Barthelemy                                   542      1005        462       1
France          St Martin                                          504      1915       1399      12
Canada          Prince Edward Island                               199       199          0       0
Canada          Northwest Territories                              126       126          0       0
United Kingdom  Bermuda                                             85      2483       2366      32
Canada          Yukon                                               82        84          0       2
Netherlands     Sint Maarten                                        74      2346       2245      27
Netherlands     Aruba                                               70     10892      10716     106
France          New Caledonia                                       67       125         58       0
Netherlands     Curacao                                             64     12266      12080     122
France          French Polynesia                                    39     18841      18661     141
United Kingdom  British Virgin Islands                              38       248        209       1
Netherlands     Bonaire, Sint Eustatius and Saba                    19      1606       1570      17
United Kingdom  Cayman Islands                                      18       574        554       2
United Kingdom  Channel Islands                                     17      4059       3956      86
Canada          Repatriated Travellers                              13        13          0       0
Canada          Grand Princess                                      13        13          0       0
United Kingdom  Turks and Caicos Islands                            11      2407       2379      17
United Kingdom  Isle of Man                                          4      1591       1558      29
United Kingdom  Anguilla                                             2       109        107       0
France          Wallis and Futuna                                    2       445        436       7
United Kingdom  Saint Helena, Ascension and Tristan da Cunha         0         4          4       0
France          Saint Pierre and Miquelon                            0        25         25       0
United Kingdom  Falkland Islands (Malvinas)                          0        63         63       0
United Kingdom  Gibraltar                                            0      4286       4192      94
United Kingdom  Montserrat                                           0        20         19       1
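
The cell above passes a list of country names to `.loc` on a frame indexed by (Country, Province). A minimal sketch of that selection, assuming `statistics_by_province` is a pandas DataFrame with a two-level MultiIndex; the rows are copied from this notebook and the frame is a stand-in for the real one:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("Canada", "Ontario"), ("Canada", "Quebec"), ("Italy", ""), ("Portugal", "")],
    names=["Country", "Province"],
)
statistics_by_province = pd.DataFrame(
    {
        "Active": [516832, 354576, 299486, 22193],
        "Deaths": [8533, 11066, 124810, 17014],
    },
    index=index,
)

# A list of labels selects on the first index level (Country), so every
# province row of each listed country is returned.
selected = statistics_by_province.loc[["Canada", "Italy"]]
selected = selected.sort_values(by="Active", ascending=False)
print(selected)
```

This is why the table lists more than 20 rows: each of the 20 countries contributes one row per province or territory.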

Plotting the cases by day

We now plot the daily data for selected countries, combining several features (Active, Infected, Recovered, Deaths) in each chart.

In [6]:
analysis.display_locations(countries = [ 'Italy', 'Portugal', 'Spain' ],
                           fill = False,
                           logy = False)



Country  Province    Active  Infected  Recovered  Deaths
Italy                299486   4178261    3753965  124810



Country  Province    Active  Infected  Recovered  Deaths
Spain               3401684   3631661     150376   79601



Country   Province    Active  Infected  Recovered  Deaths
Portugal               22193    843729     804522   17014

Interactive Sunburst Pie Charts

We now present interactive charts in which you can select a region or subregion to see its cases in more detail.

In [7]:
analysis.display_sunburst_chart(label = 'Active')
In [8]:
analysis.display_sunburst_chart(label = 'Recovered')
In [9]:
analysis.display_sunburst_chart(label = 'Deaths')
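
A sunburst chart needs a hierarchy of nodes, each with a parent and a value. The sketch below shows one way to derive such a hierarchy from (Country, Province) rows; it is not the library's implementation. `sunburst_nodes` and the `(no province)` label are hypothetical names introduced for illustration:

```python
from collections import defaultdict

# (country, province, active) rows copied from tables in this notebook.
ROWS = [
    ("Netherlands", "", 1597994),
    ("Netherlands", "Aruba", 70),
    ("Netherlands", "Curacao", 64),
    ("Portugal", "", 22193),
]


def sunburst_nodes(rows, unnamed="(no province)"):
    """Return parallel lists (ids, labels, parents, values) for two levels."""
    totals = defaultdict(int)
    for country, _province, value in rows:
        totals[country] += value

    ids, labels, parents, values = [], [], [], []
    # Inner ring: one node per country, sized by the sum of its rows.
    for country, total in totals.items():
        ids.append(country)
        labels.append(country)
        parents.append("")  # an empty parent marks a root node
        values.append(total)
    # Outer ring: one node per (country, province) row.
    for country, province, value in rows:
        label = province or unnamed
        ids.append(f"{country}/{label}")
        labels.append(label)
        parents.append(country)
        values.append(value)
    return ids, labels, parents, values


ids, labels, parents, values = sunburst_nodes(ROWS)
for node in zip(ids, parents, values):
    print(node)
```

If Plotly is the charting library, these four lists match the `ids`, `labels`, `parents` and `values` arguments of `plotly.graph_objects.Sunburst`, with `branchvalues="total"` because each country value already includes its children.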

Querying dataset

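We now query the dataset for locations with more than 50,000,000 infections, fewer active cases than recoveries and more than 200,000 deaths, and add two explicitly named locations (United Kingdom and Brazil). No location matches the query, so the first table in the output is empty.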

In [10]:
interesting_locations = {
    ('United Kingdom', ''),
    ('Brazil', ''),
}
analysis.display_locations(
    locations = interesting_locations,
    query='Infected > 50000000 & Active < Recovered & Deaths > 200000',
    provinces = False
)
Country  Province    Active  Infected  Recovered  Deaths



Country  Province    Active  Infected  Recovered  Deaths
Brazil              1369911  15894094   14080089  444094



Country         Province    Active  Infected  Recovered  Deaths
United Kingdom             4327520   4455221          0  127701
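
The query string in the cell above uses the syntax of pandas `DataFrame.query`. A minimal sketch, assuming the library forwards the string to `query` unchanged; the rows are copied from this notebook and the relaxed threshold of 2000 infections is only there to show a non-empty result:

```python
import pandas as pd

stats = pd.DataFrame(
    {
        "Active": [1369911, 4327520, 3027925, 299486],
        "Infected": [15894094, 4455221, 26031991, 4178261],
        "Recovered": [14080089, 0, 22712735, 3753965],
        "Deaths": [444094, 127701, 291331, 124810],
    },
    index=pd.Index(["Brazil", "United Kingdom", "India", "Italy"], name="Country"),
)

# Inside query(), `&` combines the three comparisons like a boolean `and`.
strict = stats.query("Infected > 50000000 & Active < Recovered & Deaths > 200000")
relaxed = stats.query("Infected > 2000 & Active < Recovered & Deaths > 200000")

print(len(strict))          # 0
print(list(relaxed.index))  # ['Brazil', 'India']
```

None of the sample rows exceeds 50,000,000 infections, which is why the strict query returns no rows.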

Selected Countries and Locations

We now present additional charts of cases around the world.

In [11]:
analysis.display_locations(countries = 'Netherlands', logy=False, provinces = True)



Country      Province    Active  Infected  Recovered  Deaths
Netherlands  Aruba           70     10892      10716     106



Country      Province        Active  Infected  Recovered  Deaths
Netherlands  Sint Maarten        74      2346       2245      27



Country      Province    Active  Infected  Recovered  Deaths
Netherlands             1597994   1615500          0   17506



Country      Province    Active  Infected  Recovered  Deaths
Netherlands  Curacao         64     12266      12080     122



Country      Province                            Active  Infected  Recovered  Deaths
Netherlands  Bonaire, Sint Eustatius and Saba        19      1606       1570      17

Animated Geographic Map

We now present an animation of the cases around the world since January 2020. It is also presented at https://blog.njaniceto.com/demos/geographic-evolution-2019-20-pandemic/.

In [12]:
# Generate an animated geographic map
analysis.display_geomap()

Final Remarks

The author

Nuno André Jeremias de Aniceto is a Technology Consultant with experience in Software Engineering, Software Architecture and DevOps.
He holds a Master's degree in Computer Science Engineering with a focus on Computer Vision, Big Data, Multimedia and 3D Simulations.
He has specializations in Deep Learning and in Data Engineering on Google Cloud Platform.

The source of the data

The datasets are compiled by Johns Hopkins University, and the data sources themselves may have some issues (such as the "Recovered" figures for Canadian provinces).

As of 2020-03-28 the data sources are:

References