Customer Acquisition Cost (CAC) Analysis is a critical aspect of business strategy where Data Science plays a vital role. CAC refers to the cost a company incurs to acquire a new customer. Understanding and optimizing this cost is crucial for sustainable growth and profitability. If you want to learn how to analyze the customer acquisition cost of a business, this article is for you. In this article, I’ll take you through the task of Customer Acquisition Cost Analysis using Python.
Customer Acquisition Cost Analysis: Process We Can Follow
Customer Acquisition Cost Analysis is a valuable tool for businesses to assess the efficiency and effectiveness of their customer acquisition efforts. It helps make informed decisions about resource allocation and marketing strategies, ultimately contributing to the company’s growth and profitability.
Below is the process we can follow for the task of customer acquisition cost analysis as a Data Science professional:
- Begin by collecting relevant data related to customer acquisition expenses.
- Segment your customer acquisition costs to understand which channels or strategies are driving customer acquisition.
- Identify key metrics that will help you calculate CAC.
- Calculate CAC for each customer acquisition channel or strategy.
- Analyze and find patterns to optimize your CAC.
So, the process of Customer Acquisition Cost Analysis starts with collecting data on customer acquisition expenses. I found an ideal dataset for this task. You can download the data from here.
Customer Acquisition Cost Analysis using Python
Let’s get started with the task of Customer Acquisition Cost Analysis by importing the necessary Python libraries and the dataset:
import pandas as pd
import plotly.express as px
import plotly.io as pio
import plotly.graph_objects as go

pio.templates.default = "plotly_white"

data = pd.read_csv("customer_acquisition_cost_dataset.csv")
print(data.head())

  Customer_ID Marketing_Channel  Marketing_Spend  New_Customers
0    CUST0001   Email Marketing      3489.027844             16
1    CUST0002        Online Ads      1107.865808             33
2    CUST0003      Social Media      2576.081025             44
3    CUST0004        Online Ads      3257.567932             32
4    CUST0005   Email Marketing      1108.408185             13
Let’s have a look at the column insights before moving forward:
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 500 entries, 0 to 499
Data columns (total 4 columns):
 #   Column             Non-Null Count  Dtype
---  ------             --------------  -----
 0   Customer_ID        500 non-null    object
 1   Marketing_Channel  500 non-null    object
 2   Marketing_Spend    500 non-null    float64
 3   New_Customers      500 non-null    int64
dtypes: float64(1), int64(1), object(2)
memory usage: 15.8+ KB
Now, let’s calculate the customer acquisition cost:
data['CAC'] = data['Marketing_Spend'] / data['New_Customers']
Here, we calculated and added a CAC value to the dataset, helping the company understand how efficiently it is acquiring customers through its marketing efforts and which marketing channels are more cost-effective for customer acquisition.
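The per-row CAC above can be rolled up to the channel level in more than one way. Here is a minimal sketch, assuming a small made-up DataFrame that only borrows the column names of the dataset (the numbers are illustrative and the names sample, totals, and channel_cac are my own). It contrasts a pooled CAC (total spend divided by total new customers) with the simple mean of the per-row CAC values:

```python
import pandas as pd

# Illustrative rows only; values are made up, column names follow the dataset.
sample = pd.DataFrame({
    "Marketing_Channel": ["Email Marketing", "Email Marketing", "Referral", "Referral"],
    "Marketing_Spend": [1000.0, 3000.0, 1200.0, 800.0],
    "New_Customers": [10, 20, 30, 10],
})

# Per-row CAC, computed the same way as in the article.
sample["CAC"] = sample["Marketing_Spend"] / sample["New_Customers"]

# Channel totals of spend and customers.
totals = sample.groupby("Marketing_Channel")[["Marketing_Spend", "New_Customers"]].sum()

channel_cac = pd.DataFrame({
    # Total spend divided by total customers for the channel.
    "pooled_CAC": totals["Marketing_Spend"] / totals["New_Customers"],
    # Simple average of the per-row CAC values.
    "mean_row_CAC": sample.groupby("Marketing_Channel")["CAC"].mean(),
})
print(channel_cac)
```

The two figures can differ because the mean of per-row values weights every row equally, while the pooled figure weights rows by how many customers they brought in. Which one is appropriate depends on the question being asked.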
Now, let’s have a look at the CAC by marketing channels:
fig1 = px.bar(data, x='Marketing_Channel', y='CAC',
              title='CAC by Marketing Channel')
fig1.show()

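So, the bar for Email marketing is the highest and the bar for social media is the lowest. Keep in mind that this chart draws one segment per row, so each bar is the stacked sum of a channel's per-row CAC values and also reflects how many rows the channel has. By mean CAC (see the summary statistics further down), Email Marketing is still the highest at about 132.9, but Referral is the lowest at about 119.9.
Now, let’s have a look at the relationship between new customers acquired and CAC:
fig2 = px.scatter(data, x='New_Customers', y='CAC',
                  color='Marketing_Channel',
                  title='New Customers vs. CAC',
                  trendline='ols')
fig2.show()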

So, the negative slope of the trendline in the above graph suggests that campaigns with a higher number of new customers tend to have a lower CAC. Part of this relationship is built into the metric: New_Customers is the denominator of CAC, so for a given spend, acquiring more customers necessarily lowers the cost per customer.
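Plotly's trendline='ols' option relies on the statsmodels package being installed. If you only want to check the sign of the slope, here is a minimal sketch using numpy.polyfit instead. The arrays hold made-up values, not rows from the dataset, and the variable names are my own:

```python
import numpy as np

# Illustrative values only (not taken from the dataset).
new_customers = np.array([10, 20, 30, 40, 50], dtype=float)
marketing_spend = np.array([2000, 2400, 2100, 2600, 2500], dtype=float)

# Same CAC definition as in the article.
cac = marketing_spend / new_customers

# Degree-1 least-squares fit: cac is approximated by slope * new_customers + intercept.
slope, intercept = np.polyfit(new_customers, cac, 1)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

A negative slope here mirrors what the trendline in the chart shows: as the number of new customers goes up, the fitted CAC goes down.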
Now let’s have a look at the summary statistics of all the marketing channels:
summary_stats = data.groupby('Marketing_Channel')['CAC'].describe()
print(summary_stats)

                   count        mean        std        min        25%         50%         75%         max
Marketing_Channel
Email Marketing    124.0  132.913758  89.597107  23.491784  68.226195  106.940622  177.441898  434.383446
Online Ads         130.0  122.135938  79.543793  24.784414  62.207753   97.736027  163.469540  386.751285
Referral           128.0  119.892174  74.101916  22.012364  71.347939   99.835688  137.577935  366.525209
Social Media       118.0  126.181913  77.498788  21.616453  75.633389  102.620356  167.354709  435.487346
By understanding the above summary statistics, you can:
- Use the mean CAC values to compare the average cost of customer acquisition across different Marketing Channels. For example, if minimizing CAC is a priority, you may want to focus on channels with lower average CAC values.
- Use the standard deviation to assess the consistency of CAC within each channel. Higher standard deviations suggest greater variability, which may require further investigation to understand the reasons behind the fluctuation in costs.
- Use quartiles to get a sense of the distribution of CAC values. For example, if you want to target cost-effective customer acquisition, you might focus on channels where the first quartile (25%) has relatively low CAC values.
- Similarly, the minimum and maximum CAC values give you an idea of the range of costs associated with each channel, helping you understand the potential cost extremes.
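The points above can also be turned into a compact per-channel comparison of spread. Here is a minimal sketch, assuming made-up CAC values and a hypothetical helper named iqr (neither comes from the dataset or from pandas itself). It computes the mean, the standard deviation, the interquartile range, and the coefficient of variation for each channel:

```python
import pandas as pd

# Illustrative CAC values only; column names follow the dataset.
sample = pd.DataFrame({
    "Marketing_Channel": ["Online Ads"] * 4 + ["Social Media"] * 4,
    "CAC": [50.0, 100.0, 150.0, 200.0, 110.0, 120.0, 130.0, 140.0],
})

def iqr(values: pd.Series) -> float:
    """Interquartile range: 75th percentile minus 25th percentile."""
    return values.quantile(0.75) - values.quantile(0.25)

# Named aggregation: one output column per keyword.
spread = sample.groupby("Marketing_Channel")["CAC"].agg(
    mean="mean",
    std="std",
    iqr=iqr,
)

# Coefficient of variation: standard deviation relative to the mean.
spread["cv"] = spread["std"] / spread["mean"]
print(spread)
```

In this toy data both channels have the same mean CAC, but Online Ads is far less consistent, which is the kind of difference the standard deviation and quartiles are meant to surface.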
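Now, let’s calculate the conversion rate of this marketing campaign. The dataset has no count of leads or visitors, so the value computed here is a proxy: the number of new customers acquired per 100 units of marketing spend.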
data['Conversion_Rate'] = data['New_Customers'] / data['Marketing_Spend'] * 100
Here are the insights into the conversion rate by marketing channel:
# Conversion Rates by Marketing Channel
fig = px.bar(data, x='Marketing_Channel', y='Conversion_Rate',
             title='Conversion Rates by Marketing Channel')
fig.show()

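So, we can see that the bar for online ads is higher than those of all other channels. Each bar sums the per-row values of a channel, and online ads has the most rows (130), so it is worth comparing per-channel averages before concluding that online ads convert better.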
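The metric computed above is New_Customers / Marketing_Spend * 100, which is algebraically the same as 100 / CAC. Here is a minimal sketch with made-up numbers (the From_CAC column is a name I introduce only for this check) showing that the two columns carry the same information:

```python
import pandas as pd

# Illustrative rows only; values are made up, column names follow the dataset.
sample = pd.DataFrame({
    "Marketing_Spend": [1000.0, 2500.0, 400.0],
    "New_Customers": [10, 50, 16],
})

# Same definitions as in the article.
sample["CAC"] = sample["Marketing_Spend"] / sample["New_Customers"]
sample["Conversion_Rate"] = sample["New_Customers"] / sample["Marketing_Spend"] * 100

# Recompute the metric from CAC alone: customers per 100 units of spend.
sample["From_CAC"] = 100 / sample["CAC"]
print(sample)
```

Because one column is just the reciprocal of the other (scaled by 100), a channel that looks good on CAC will also look good on this metric. A conversion rate in the usual sense would need the number of leads or visitors as the denominator.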
Now, let’s calculate the break-even customers for this marketing campaign. Break-even customers refer to the number of new customers that a company needs to acquire through a specific marketing channel to cover the costs associated with that marketing channel. When the actual number of new customers acquired through the channel exceeds the break-even number, it indicates that the marketing efforts are generating more revenue than the costs, resulting in a profit. Here’s how to find break-even customers for each marketing channel:
data['Break_Even_Customers'] = data['Marketing_Spend'] / data['CAC']

fig = px.bar(data, x='Marketing_Channel', y='Break_Even_Customers',
             title='Break-Even Customers by Marketing Channel')
fig.show()

Now, let’s compare the actual customers acquired with the break-even customers for each marketing channel:
fig = go.Figure()

# Actual Customers Acquired
fig.add_trace(go.Bar(x=data['Marketing_Channel'], y=data['New_Customers'],
                     name='Actual Customers Acquired',
                     marker_color='royalblue'))

# Break-Even Customers
fig.add_trace(go.Bar(x=data['Marketing_Channel'], y=data['Break_Even_Customers'],
                     name='Break-Even Customers',
                     marker_color='lightcoral'))

# Update the layout
fig.update_layout(barmode='group',
                  title='Actual vs. Break-Even Customers by Marketing Channel',
                  xaxis_title='Marketing Channel',
                  yaxis_title='Number of Customers')

# Show the chart
fig.show()

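So, the actual customers acquired from all marketing channels exactly match the break-even customers. This match holds by construction and is not a result of the campaign: because CAC is defined as Marketing_Spend / New_Customers, the expression Marketing_Spend / CAC always equals New_Customers. To judge whether a channel really breaks even, the comparison needs a revenue or value figure per customer, which this dataset does not include. If the actual customers acquired were short of such a break-even point, it would indicate a need to reassess marketing strategies or allocate additional resources to those channels.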
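For a break-even figure that can differ from the actual customer count, some revenue assumption is needed. Here is a minimal sketch, assuming a hypothetical flat revenue of 100 per new customer (REVENUE_PER_CUSTOMER is my own placeholder and does not come from the dataset) and a few made-up rows:

```python
import math

import pandas as pd

# Hypothetical assumption: each new customer brings in this much revenue.
REVENUE_PER_CUSTOMER = 100.0

# Illustrative rows only; values are made up, column names follow the dataset.
sample = pd.DataFrame({
    "Marketing_Channel": ["Email Marketing", "Online Ads", "Referral"],
    "Marketing_Spend": [3000.0, 1100.0, 2500.0],
    "New_Customers": [16, 33, 25],
})

# Customers needed for revenue to cover spend, rounded up to whole customers.
sample["Break_Even_Customers"] = (
    sample["Marketing_Spend"] / REVENUE_PER_CUSTOMER
).apply(math.ceil)

# True when the row acquired at least as many customers as it needed.
sample["Meets_Break_Even"] = sample["New_Customers"] >= sample["Break_Even_Customers"]
print(sample)
```

With a real revenue or lifetime-value estimate in place of the placeholder, the same comparison shows which campaigns pay for themselves and which fall short.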
Summary
So this is how you can perform Customer Acquisition Cost Analysis using Python. It helps a business see which marketing channels acquire customers most cost-effectively and where spend may need to be reallocated. I hope you liked this article on Customer Acquisition Cost Analysis using Python. Feel free to ask questions in the comments section below.