Daily Percentage Change

MP
2 min read · May 20, 2020

Percentage change in Covid numbers from one day to the next

This is the first in a series of statistical charts I am going to plot as part of my Towards Building Covid 19 Tracker series.

Daily change is the difference between today’s numbers and the previous day’s numbers.

Percentage change is then (DailyChange * 100 / previous day’s numbers). Note that the divisor is the previous day’s value, not today’s.

With the pandas library in Python, one can do this whole calculation with a single function call, pct_change. Here is how you do it:

df.pct_change()*100 # where df = data frame
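To make the formula concrete, here is a minimal sketch that computes the percentage change by hand and compares it with pct_change. The series name active and the variables manual and builtin are only illustrative; the numbers are the active counts from the sample data in this post.

```python
import pandas as pd

# Active-case counts copied from the sample data in this post.
active = pd.Series([46008, 47480, 49219, 51401], name="active_count")

# Manual calculation: (today - previous day) / previous day * 100.
previous = active.shift(1)
manual = (active - previous) / previous * 100

# The same calculation with the built-in helper.
builtin = active.pct_change() * 100

# The first row is NaN in both columns because it has no previous day.
print(pd.DataFrame({"manual": manual, "pct_change": builtin}).round(2))
```

Both columns should agree row for row, which confirms that pct_change divides by the previous value.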

Let’s look at our sample data:

date,active_count,cure_count,death_count,migrated,source

12/05/20 07:35:04,46008,22454,2293,1,https://www.mohfw.gov.in/

13/05/20 08:56:46,47480,24385,2415,1,https://www.mohfw.gov.in/

14/05/20 14:10:50,49219,26234,2549,1,https://www.mohfw.gov.in/

15/05/20 09:32:04,51401,27919,2649,1,https://www.mohfw.gov.in/

For the daily percentage change we would call it as follows:

df = self.dataf.iloc[:,1:-1].pct_change()*100 # skip the first (date) and last (source) columns
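For readers who want to try this outside the tracker class, here is a rough standalone sketch. It assumes the sample rows above are loaded from a string, uses a plain variable dataf instead of self.dataf, and assumes the dates are in day/month/year order (the names csv_text, dataf, date_df and pct_df are illustrative).

```python
import io

import pandas as pd

# The sample rows from this post, inlined so the sketch is self-contained.
csv_text = """date,active_count,cure_count,death_count,migrated,source
12/05/20 07:35:04,46008,22454,2293,1,https://www.mohfw.gov.in/
13/05/20 08:56:46,47480,24385,2415,1,https://www.mohfw.gov.in/
14/05/20 14:10:50,49219,26234,2549,1,https://www.mohfw.gov.in/
15/05/20 09:32:04,51401,27919,2649,1,https://www.mohfw.gov.in/
"""

dataf = pd.read_csv(io.StringIO(csv_text))

# Assumption: dates are day/month/year, i.e. 12/05/20 is 12 May 2020.
date_df = pd.to_datetime(dataf["date"], format="%d/%m/%y %H:%M:%S")

# Keep only the numeric columns (drop date and source), then compute the change.
pct_df = dataf.iloc[:, 1:-1].pct_change() * 100

print(pct_df.round(2))
```

The first row of pct_df is NaN for every column, since there is no earlier day to compare against.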

Once we have this, plotting it with Matplotlib needs some manual hand-holding, since our data is now split across a data frame of pct values and a series of date values.

We want to end up with this plot:

[Figure: Daily Percentage Change]

There are four parts to this plot, which suggests creating the Figure with plt.subplots() so that we get a handle to add each part to.

We start by getting handles to the figure and the subplot as follows:

fig, ax = plt.subplots()

Then we start adding information to the figure through the subplot handle.

a. PCT values vs Recorded Date

ax.plot(date_df, vals) # plot date series on x-axis & pct values on y 

b. Grid to track data points over dates

ax.grid(True) # enable grid for the plot

c. Highlighting Data points

To highlight data points, Matplotlib provides format strings. We use “-o”, where “o” is the marker drawn at each data point and “-” draws a line between the points.

So instead of the code in step “a”, we use the following:

ax.plot(date_df, vals, '-o', label=txt) # txt is the legend label for this line

d. Plotting Values at the data points

Plotting values at the data points is not straightforward. In order to achieve this we need to use the annotate function. For our purposes, annotate takes two parameters:

  • Value
  • XY coordinate

In our case the value is the pct value at the data point, and the xy coordinate is the combination of the date at that index and the pct value. Here is how to do that:

plt.annotate(f"{round(val,2)}%",xy=[date_df[idx], val]) # idx is the position of val in the series
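Putting steps a to d together, a minimal end-to-end sketch might look like the following. It is not the tracker's actual code: the dates and values are hard-coded from the sample data, the Agg backend and output file name are assumptions so the script runs without a display, and NaN values are skipped because the first day has no percentage change.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: render to a file, no display needed

import matplotlib.pyplot as plt
import pandas as pd

# Dates and active counts from the sample data in this post.
date_df = pd.Series(
    pd.to_datetime(["2020-05-12", "2020-05-13", "2020-05-14", "2020-05-15"])
)
vals = pd.Series([46008, 47480, 49219, 51401]).pct_change() * 100

fig, ax = plt.subplots()

# Steps a and c: line plus a marker at each data point.
ax.plot(date_df, vals, "-o", label="active_count")

# Step b: grid.
ax.grid(True)

# Step d: write the value next to each point, skipping the leading NaN.
annotations = []
for idx, val in enumerate(vals):
    if pd.isna(val):
        continue
    annotations.append(ax.annotate(f"{round(val, 2)}%", xy=(date_df[idx], val)))

ax.set_ylabel("Daily change (%)")
ax.legend()         # shows the label passed to ax.plot
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
fig.savefig("daily_pct_change.png")  # hypothetical output file name
```

To plot every column, the same ax.plot and annotate loop can be repeated once per column of the pct data frame, passing the column name as the label.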

That’s it. This is all that is needed.

I will add a GitHub link to the complete code soon.
