How to calculate the Customer Retention Rate (CRR)? [Shopify + Python example]

The Customer Retention Rate (CRR) is an important measure often requested by managers in an online business. You can think of it as the rate at which customers continue to do business with you in a given period of time.

The Customer Retention Rate is often considered one of the core KPIs because it gives you a clear insight into the health of the business: the ability to retain customers is critical for business success.

In this article, I will not linger on the properties of the CRR; I am assuming you already know why it matters. Instead, I will show you how to calculate the Customer Retention Rate for a Shopify store, first manually and then with Python.

How do you calculate the Customer Retention Rate (CRR)?

You can calculate the Customer Retention Rate with a simple formula that has three variables. Take the number of customers at the end of the period (E), subtract the new customers acquired during the period (N), and divide the result by the number of customers at the start of the period (S).

The CRR’s mathematical representation is CRR = ((E-N)/S) x 100

Practically, let’s assume you had 200 customers at the end of 2019 (S), and you want to know your Customer Retention Rate for 2020. At the end of 2020, you had 375 customers (E), of which 235 were new ones (N).

Consequently, we have ((375 – 235)/200) x 100 = (140/200) x 100 = 0.7 x 100 = 70%. The Customer Retention Rate for that store is therefore 70%.
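
If you want to check the arithmetic, the formula fits in a few lines of Python. This is only a minimal sketch: the function name retention_rate and its argument names are my own illustrative choices, and the numbers are S = 200, E = 375 and N = 235 from the calculation above.

```python
def retention_rate(start_customers, end_customers, new_customers):
    """Return the CRR in percent: ((E - N) / S) * 100."""
    if start_customers <= 0:
        # Without customers at the start of the period the rate is undefined
        raise ValueError("S (customers at the start of the period) must be positive")
    return (end_customers - new_customers) / start_customers * 100


# S = 200, E = 375, N = 235
print(f"{retention_rate(200, 375, 235):.1f}%")
```

Raising an error when S is zero is a design choice of this sketch; you could just as well return 0 or None, depending on how you want to report an undefined rate.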

How do you calculate the Customer Retention Rate on a large Shopify Dataset?

Use this method if you do not have any coding experience. It is a manual method that relies on the reports generated by Shopify. Log in to your Shopify Admin account and go to Analytics > Reports. Next, scroll down to the Customer section and select First-time vs returning customer sales.

Assuming that we want to calculate the yearly retention rate for the previous year, here are the next steps.

1. On the date range dropdown, select the year before the one you want to calculate the CRR for (the previous period). Since I want to know the CRR of 2020, I will first select 2019 (01-01-2019 to 31-12-2019).
2. Under Edit columns, tick the Customers and Customer Type columns.
3. Record your “Start of period customers (S)” from the top of your table under Customers. (30k)
4. On the date range dropdown, select the period you want to calculate the CRR for. In my case, it is going to be 2020 (01-01-2020 to 31-12-2020).
5. Record your “End of period customers (E)” from the top of your table under Customers. (70k)
6. Record your “New customers during the period (N)” from your table under First-time. (50k)
7. Finally, all you have to do is plug those numbers into the formula: take the difference between step 5 (E) and step 6 (N), and divide it by step 3 (S). In my case, we have ((E – N)/S) x 100 = ((70k – 50k)/30k) x 100 = 0.667 x 100 = 66.7%. Our CRR is ~67%.

I hope everything is clear. If so, try it yourself in your store. If this information is valuable, consider subscribing.

You can do this using any report. You need to play with the date range, select the appropriate columns, and filter as shown above. Remember that you can also do this per month. All you have to do is change the date range to the month of interest and apply the same procedure.
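
If you repeat the manual procedure for several periods, it can be handy to keep the recorded numbers in a small table and let pandas apply the formula to every row. The sketch below assumes you typed the numbers in by hand. The 2020 row uses the figures from the steps above, while the second row is a made-up placeholder that only shows the layout.

```python
import pandas as pd

# One row per period, filled in by hand from the Shopify report.
# Only the 2020 row comes from this article; the other row is invented.
recorded = pd.DataFrame(
    {
        "period": ["2020", "placeholder period"],
        "S": [30_000, 70_000],  # customers at the start of the period
        "E": [70_000, 90_000],  # customers at the end of the period
        "N": [50_000, 45_000],  # first-time customers during the period
    }
)

# Apply CRR = ((E - N) / S) * 100 to every row at once
recorded["CRR"] = ((recorded["E"] - recorded["N"]) / recorded["S"] * 100).round(1)

print(recorded)
```

The same table works for monthly figures: add one row per month with its own S, E and N.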

How do you calculate the Customer Retention Rate (CRR) on Shopify Data with Python?

1. Prepare the dataset

This method works on Shopify data, but also on any dataset that has the same structure. To calculate the Customer Retention Rate with Python on Shopify, we first need to extract our working dataset. Again, we are calculating the CRR for 2020, so you will need a dataset with the following variables: Year, Customer ID, Customer Type, and the customer billing location (optional).

The procedure is as follows if you are a Shopify user. In your Shopify Admin, go to Analytics > Reports. Under the Sales section, select Sales over time. On that report, you will want to:

1. Select the date range of interest. In this case, the range has to cover both the previous year and the year of interest, so select 01-01-2019 to 31-12-2020.
2. Under Manage/Edit Columns, tick Customer ID, Customer Type, and the customer billing location. You can untick the rest if you do not need it.
3. On top of the page, go to Export > Export.

And you are set. Save or move the file to a known location on your machine and open your favorite IDE. Note that you can also use the Shopify API to extract the dataset above.

2. Calculating the Yearly Customer Retention Rate (CRR) with Python

Import the required libraries. In this case, we will only need Pandas.

import pandas as pd
df = pd.read_csv("Path/to/file.csv")
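
Before going further, it can help to check that the export contains the columns the rest of the code relies on. The sketch below assumes the columns should end up named year, customer_id, customer_type and billing_region, which are the names used in this article's code. A real Shopify export may label its headers differently, so treat the header names in the sample as hypothetical and adapt them. The helper name prepare_export is also just an illustration.

```python
import io

import pandas as pd

REQUIRED = ["year", "customer_id", "customer_type", "billing_region"]


def prepare_export(df):
    """Normalize header names and check that the required columns exist."""
    out = df.copy()
    # "Customer Type" -> "customer_type", " Year " -> "year", etc.
    out.columns = (
        out.columns.str.strip().str.lower().str.replace(" ", "_", regex=False)
    )
    missing = [col for col in REQUIRED if col not in out.columns]
    if missing:
        raise KeyError(f"Missing columns in export: {missing}")
    # Rows without a customer id cannot be attributed to any customer
    return out.dropna(subset=["customer_id"])


# Tiny in-memory stand-in for the exported file (made-up rows and headers)
sample_csv = io.StringIO(
    "Year,Customer ID,Customer Type,Billing Region\n"
    "2019,1,First-time,Ontario\n"
    "2020,1,Returning,Ontario\n"
    "2020,,First-time,Quebec\n"
)
clean = prepare_export(pd.read_csv(sample_csv))
print(clean)
```

With your own file, you would replace sample_csv with the path to the export.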

Calculating the Overall Yearly Customer Retention Rate with Python/Pandas

You can calculate the yearly Customer Retention Rate using a few lines of code. The idea is to extract the three variables (S, N, and E) by filtering the dataset and counting the unique customers in each group.


def customer_retention_rate_year(df, end_year):
    """Return the CRR (in %) for end_year, using the previous year as the start."""
    try:
        # Customers at the start of the period, i.e. all customers of the previous year ("S")
        S = df.loc[df.year == end_year - 1].customer_id.nunique()
        # First-time customers during end_year ("N")
        N = df.loc[
            (df.year == end_year) & (df.customer_type == "First-time")
        ].customer_id.nunique()
        # All customers during end_year ("E")
        E = df.loc[df.year == end_year].customer_id.nunique()
        return ((E - N) / S) * 100
    except ZeroDivisionError:
        # No customers in the previous year: the rate is undefined, so return 0
        return 0

print(customer_retention_rate_year(df, 2020))
# 70.134
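
To convince yourself that the function behaves as expected, you can run it on a tiny hand-made dataset where the answer is easy to work out on paper. The sketch below re-declares a compact version of the function so that it runs on its own, and the rows are invented. The set-based cross-check at the end is an alternative way of counting retained customers (those present in both years), not part of the method described in this article.

```python
import pandas as pd


def customer_retention_rate_year(df, end_year):
    """CRR in percent for end_year, using ((E - N) / S) * 100."""
    S = df.loc[df.year == end_year - 1, "customer_id"].nunique()
    E = df.loc[df.year == end_year, "customer_id"].nunique()
    N = df.loc[
        (df.year == end_year) & (df.customer_type == "First-time"), "customer_id"
    ].nunique()
    return ((E - N) / S) * 100 if S else 0


# Invented data: customers 1-4 bought in 2019; 1, 2 and 3 came back in 2020,
# and customer 5 is new in 2020.
toy = pd.DataFrame(
    {
        "year": [2019, 2019, 2019, 2019, 2020, 2020, 2020, 2020],
        "customer_id": [1, 2, 3, 4, 1, 2, 3, 5],
        "customer_type": ["First-time"] * 4 + ["Returning"] * 3 + ["First-time"],
    }
)

rate = customer_retention_rate_year(toy, 2020)

# Cross-check: customers present in both years, divided by the 2019 customers
before = set(toy.loc[toy.year == 2019, "customer_id"])
after = set(toy.loc[toy.year == 2020, "customer_id"])
cross_check = len(before & after) / len(before) * 100

print(rate, cross_check)
```

Here S = 4, E = 4 and N = 1, so both numbers come out at 75%. On real data they can differ, for example when a customer flagged as returning in 2020 made their previous purchase before 2019.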


Calculating the Yearly Customer Retention Rate by location with Python/Pandas

We can reuse the customer_retention_rate_year() function above. All we have to do is pass the dataset, filtered by location, through that function. Here’s how you do it and create the final dataset.

# Get the list of locations in the dataframe (rows without a region are skipped)
states = sorted(df.billing_region.dropna().unique())
# Calculate the Customer Retention Rate per location
crr = [
    customer_retention_rate_year(df.loc[df.billing_region == state], 2020)
    for state in states
]
# Save the results in a dataframe
res = pd.DataFrame({"State": states, "CRR": crr})
# Export the results (pandas needs an Excel writer such as openpyxl for this)
res.to_excel("CRR.xlsx")
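
If you have many segments, the per-location calculation can also be written with groupby instead of a loop. This is a sketch of an alternative, not the method used above: the helper name crr_by_group is hypothetical, the toy rows are invented, and the column names are the same ones assumed throughout this article.

```python
import pandas as pd


def crr_by_group(df, end_year, group_col="billing_region"):
    """Return S, E, N and the CRR (%) for each value of group_col."""
    prev = df[df.year == end_year - 1]
    curr = df[df.year == end_year]
    new = curr[curr.customer_type == "First-time"]

    # Count unique customers per group; groups missing from a year get 0
    res = pd.DataFrame(
        {
            "S": prev.groupby(group_col)["customer_id"].nunique(),
            "E": curr.groupby(group_col)["customer_id"].nunique(),
            "N": new.groupby(group_col)["customer_id"].nunique(),
        }
    ).fillna(0)

    # Groups with no customers in the previous year get NaN instead of an error
    res["CRR"] = (res["E"] - res["N"]) / res["S"].where(res["S"] > 0) * 100
    return res.rename_axis(group_col).reset_index()


# Invented rows, only to show the shape of the result
toy = pd.DataFrame(
    {
        "year": [2019, 2019, 2019, 2020, 2020, 2020, 2020],
        "customer_id": [1, 2, 3, 1, 4, 3, 5],
        "customer_type": ["First-time"] * 3
        + ["Returning", "First-time", "Returning", "First-time"],
        "billing_region": ["Ontario", "Ontario", "Quebec",
                           "Ontario", "Ontario", "Quebec", "Quebec"],
    }
)

result = crr_by_group(toy, 2020)
print(result)
```

Unlike the function above, this version reports an undefined rate as NaN rather than 0, which makes segments without customers in the previous year easier to spot.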

The final result is a table with one row per state and its CRR.

There you have it. You have the yearly Customer Retention Rate calculated with Python. Knowing a bit of Python can save you a lot of time. Yes, the CRR formula is easy to apply to a store in general. But you can imagine how tedious it is to apply the same process to all your customer segments. The Python method saves you a lot of time: you wrangle and filter the data down to the desired segment, pass the filtered data through the CRR function, and the job is done.

I hope this information was of use to you.

Feel free to use any information from this page. I’d appreciate it if you can simply link to this article as the source. If you have any additional questions, you can reach out to malick@malicksarr.com. If you want more content like this, join my email list to receive the latest articles. I promise I do not spam.