Summary

The notebook

Hi, this notebook was written to analyse the 2019-2020 outbreak of the SARS-CoV-2 virus in Europe.

It was written to:

  • provide the current status of the outbreak as reported by the authorities;
  • present this status with interactive charts;
  • write a small library to analyse and predict the plateau and duration of the disease;
  • add something new to my personal website.

The disease

The disease COVID-19 is caused by the virus SARS-CoV-2 (Severe Acute Respiratory Syndrome Coronavirus 2).

Your help

This notebook let me practise some older skills and, perhaps, reach an interested audience.
If you have a suggestion on how to make the notebook more useful, I would be grateful to hear it (nuno.aja@gmail.com).

Kind regards, The author.

NOTICE: ongoing work


Section I

In this section we download the current datasets, then process and load them.

For convenience and security, the code that downloads and prepares the data is wrapped in a small library, so the cells below only need to call it.


In [1]:
import modules_loader
from modules.analytics.sars_cov2_2019_20 import analytics
In [2]:
analysis = analytics.scenario()
available_countries = analysis.download_source_datafiles(force = True)
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-9xqax9d0 because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
In [3]:
target_cases = analysis.process_dataset(available_countries)
Preparing dataset for 192 coutries.
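
The Matplotlib warning printed by cell 2 asks for a writable MPLCONFIGDIR. A minimal sketch of one way to provide it, assuming the lines run in a cell before anything imports Matplotlib; the directory name matplotlib-config is a hypothetical choice, not something this notebook defines:

```python
import os
import tempfile

# Hypothetical location: any directory the notebook user can write to will do.
config_dir = os.path.join(tempfile.gettempdir(), "matplotlib-config")
os.makedirs(config_dir, exist_ok=True)

# Matplotlib reads this variable when it is first imported,
# so it has to be set before that import happens.
os.environ["MPLCONFIGDIR"] = config_dir
```

Reusing a fixed directory, rather than a new temporary one on every run, lets Matplotlib keep its font cache between sessions.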

Section II

In this section we explore the current data.


Selecting the (20) countries with most active cases

We now select the countries with the most active cases and plot the result as a bar chart.

Our function plots the data and saves the chart automatically.

Equivalent code would be:

top_20_countries_active_cases = analysis.statistics_by_country.sort_values(by='Active', ascending=False).head(20)
top_20_countries_active_cases.iplot(kind='bar', subplots=False, title='The (20) countries with most active cases');
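
The two lines above need the library's statistics_by_country table. A minimal, self-contained sketch of the same sort-then-truncate step on a tiny frame; the frame called stats is a stand-in built from totals printed later in this notebook, not the library's real table:

```python
import pandas as pd

# Stand-in data: totals copied from the tables printed later in this notebook.
# The Netherlands row is the one shown there with an empty Province.
stats = pd.DataFrame(
    {
        "Active": [1369911, 4327520, 1597994],
        "Infected": [15894094, 4455221, 1615500],
        "Recovered": [14080089, 0, 0],
        "Deaths": [444094, 127701, 17506],
    },
    index=pd.Index(["Brazil", "United Kingdom", "Netherlands"], name="Country"),
)

# Sort by the chosen feature, largest first, then keep the first n rows.
top_2_active = stats.sort_values(by="Active", ascending=False).head(2)
print(top_2_active)
```

With pandas, stats.nlargest(2, "Active") should select the same rows in one call.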
In [4]:
top_20_countries_active_cases = analysis.show_top_cases(20, feature = 'Active')

The 20 countries with most active cases

The data for these selected countries follows.

In [ ]:
display(analysis.statistics_by_province.loc[list(top_20_countries_active_cases.index)].sort_values(by='Active', ascending=False))
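
The cell above passes a list of country names to .loc. A minimal sketch of how that selection behaves, assuming statistics_by_province is indexed by (Country, Province) as the printed tables suggest; the frame by_province is a stand-in built from rows shown later in this notebook:

```python
import pandas as pd

# Stand-in rows copied from the tables printed later in this notebook.
rows = [
    ("Brazil", "", 1369911, 15894094, 14080089, 444094),
    ("Netherlands", "", 1597994, 1615500, 0, 17506),
    ("Netherlands", "Aruba", 70, 10892, 10716, 106),
    ("Netherlands", "Curacao", 64, 12266, 12080, 122),
    ("United Kingdom", "", 4327520, 4455221, 0, 127701),
]
columns = ["Country", "Province", "Active", "Infected", "Recovered", "Deaths"]
by_province = pd.DataFrame(rows, columns=columns).set_index(["Country", "Province"])

# A list of first-level labels keeps every province row of those countries.
selected = by_province.loc[["Netherlands", "Brazil"]]

# Order the selected rows by active cases, largest first.
selected = selected.sort_values(by="Active", ascending=False)
print(selected)
```

Each selected country contributes all of its province rows, so the result can be longer than the list of countries passed in.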

Plotting the cases by day

We now plot the data for a few countries, combining several features in each chart.

In [ ]:
analysis.display_locations(countries = [ 'US', 'Italy', 'Portugal', 'Spain' ],
                           fill = False,
                           logy = False)

Interactive Sunburst Pie Charts

Now we present interactive sunburst charts, where you can select a region or subregion to see its cases in more detail.

In [7]:
analysis.display_sunburst_chart(label = 'Active')
In [8]:
analysis.display_sunburst_chart(label = 'Recovered')
In [9]:
analysis.display_sunburst_chart(label = 'Deaths')
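
A sunburst chart draws a hierarchy as rings, here countries on the inside and provinces on the outside. A minimal sketch of the aggregation behind such a chart, under the assumption that a country segment is sized as the sum of its province rows; how display_sunburst_chart builds its hierarchy is not shown in this notebook, and the frame leaves is a stand-in built from rows printed later on:

```python
import pandas as pd

# Stand-in rows (Country, Province, Active) copied from later outputs.
rows = [
    ("Netherlands", "", 1597994),
    ("Netherlands", "Aruba", 70),
    ("Netherlands", "Sint Maarten", 74),
    ("Netherlands", "Curacao", 64),
    ("Netherlands", "Bonaire, Sint Eustatius and Saba", 19),
    ("Brazil", "", 1369911),
]
leaves = pd.DataFrame(rows, columns=["Country", "Province", "Active"])

# Outer ring: one segment per (Country, Province) row.
# Inner ring: one segment per country, sized as the sum of its rows.
inner_ring = leaves.groupby("Country")["Active"].sum()
print(inner_ring)
```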

Querying dataset

We now query the dataset for countries whose counts satisfy a filter (here more than 50,000,000 infections, fewer active cases than recoveries and more than 200,000 deaths) and add some additional locations of interest.

In [10]:
interesting_locations = {
    ('United Kingdom', ''),
    ('Brazil', ''),
}
analysis.display_locations(
    locations = interesting_locations,
    query='Infected > 50000000 & Active < Recovered & Deaths > 200000',
    provinces = False
)
Country  Province  Active  Infected  Recovered  Deaths

Country  Province  Active   Infected  Recovered  Deaths
Brazil             1369911  15894094  14080089   444094

Country         Province  Active   Infected  Recovered  Deaths
United Kingdom            4327520  4455221   0          127701
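
The query string in cell 10 reads like a pandas query expression. A minimal sketch of how such an expression filters rows, assuming display_locations hands the string to DataFrame.query (the notebook does not show this); the relaxed threshold in the second query is a hypothetical value chosen only to show a match:

```python
import pandas as pd

# Stand-in rows copied from the two tables printed above.
rows = [
    ("Brazil", "", 1369911, 15894094, 14080089, 444094),
    ("United Kingdom", "", 4327520, 4455221, 0, 127701),
]
columns = ["Country", "Province", "Active", "Infected", "Recovered", "Deaths"]
stats = pd.DataFrame(rows, columns=columns).set_index(["Country", "Province"])

# The expression from cell 10: neither row has over 50,000,000 infections.
strict = stats.query("Infected > 50000000 & Active < Recovered & Deaths > 200000")

# Hypothetical relaxed threshold (10,000,000 infections) for illustration.
relaxed = stats.query("Infected > 10000000 & Active < Recovered & Deaths > 200000")

print(len(strict), list(relaxed.index.get_level_values("Country")))
```

On these two rows the strict expression matches nothing, which is consistent with the empty first table above, while the relaxed one keeps only Brazil.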

Selected Countries and Locations

Now we present additional charts of the cases around the world.

In [11]:
analysis.display_locations(countries = 'Netherlands', logy=False, provinces = True)



Country      Province                          Active   Infected  Recovered  Deaths
Netherlands  Aruba                             70       10892     10716      106
Netherlands  Sint Maarten                      74       2346      2245       27
Netherlands                                    1597994  1615500   0          17506
Netherlands  Curacao                           64       12266     12080      122
Netherlands  Bonaire, Sint Eustatius and Saba  19       1606      1570       17

Animated Geographic Map

Now we present an animation of the cases around the world since January 2020. It is also published at https://blog.njaniceto.com/demos/geographic-evolution-2019-20-pandemic/.

In [12]:
# Generate an animated geographic map
analysis.display_geomap()

Final Remarks

The author

Nuno André Jeremias de Aniceto is a Technology Consultant with experience in Software Engineering, Software Architecture and DevOps.
He holds a Master's degree in Computer Science Engineering with a focus on Computer Vision, Big Data, Multimedia and 3D Simulations.
He has specializations in Deep Learning and in Data Engineering on Google Cloud Platform.

The source of the data

The datasets are compiled by Johns Hopkins University, and the data sources themselves may have some issues (such as the "Recovered" figures for Canadian provinces).
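
Issues of this kind can be spotted with a couple of simple checks. A minimal sketch, based on the observation that in the tables printed in this notebook Infected equals Active + Recovered + Deaths; the frame stats is a stand-in built from those tables, and the 1000-case threshold is a hypothetical cut-off:

```python
import pandas as pd

# Stand-in rows copied from the tables printed in this notebook.
rows = [
    ("Brazil", "", 1369911, 15894094, 14080089, 444094),
    ("United Kingdom", "", 4327520, 4455221, 0, 127701),
    ("Netherlands", "", 1597994, 1615500, 0, 17506),
    ("Netherlands", "Aruba", 70, 10892, 10716, 106),
]
columns = ["Country", "Province", "Active", "Infected", "Recovered", "Deaths"]
stats = pd.DataFrame(rows, columns=columns).set_index(["Country", "Province"])

# Check 1: do the four counts add up on every row?
balanced = stats["Infected"] == stats["Active"] + stats["Recovered"] + stats["Deaths"]

# Check 2: rows with many cases but no recoveries reported at all.
# The threshold of 1000 cases is a hypothetical cut-off.
suspicious = stats[(stats["Recovered"] == 0) & (stats["Infected"] > 1000)]

print(balanced.all())
print(suspicious)
```

Rows flagged by the second check appear to carry an inflated Active count, because the unreported recoveries stay in that column.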

As of 2020-03-28 the data sources are:

References