What is Product-Market Fit?
Product-Market Fit (PMF) refers to how well a product satisfies a significant market demand. The term, popularized by Marc Andreessen in 2007, describes a scenario where a startup finds a substantial group of customers who truly resonate with its product. In simpler terms, PMF is when a product effectively meets the needs of a broad market, signifying that the startup is on the right track.
Quantitatively, PMF can be assessed using business metrics that indicate a product's acceptance and demand. Key indicators include growth rate, retention rate, Net Promoter Score (NPS), churn rate, and unit economics. For instance, a consistent and accelerating growth rate in revenue or user activity suggests increasing demand. High retention rates indicate that customers find ongoing value in the product, while a high NPS signifies strong customer satisfaction and willingness to recommend the product. Low churn rates suggest that customers are staying and potentially increasing their engagement or spending. Positive unit economics, where the lifetime value (LTV) of a customer exceeds the cost of acquiring that customer (CAC), point to sustainable growth. These metrics can be combined into a composite PMF score using methods like weighted averages, geometric means, or principal component analysis.
What Product-Market Fit Is Not
There are several misconceptions about what constitutes PMF. One common misconception is that high initial growth signifies PMF. Founders often assume that an initial spike in user acquisition or revenue means they have achieved PMF. However, initial growth can be driven by marketing spend, promotions, or other short-lived factors. True PMF is reflected in sustained growth and high retention rates over time.
Another misconception is that a feature-rich product automatically equates to PMF. Some believe that having many features means the product meets market needs. In reality, a feature-rich product is not necessarily one that effectively solves the target customers’ problems. PMF is about delivering the right features that address the customers' needs.
Market validation is another area where misconceptions arise. Receiving positive feedback from a few industry experts or early adopters is often mistaken for PMF. True PMF is validated by the behavior of a broad customer base, not just a few enthusiasts or insiders.
Finally, significant revenue generation is often seen as a sign of PMF. While revenue is important, it can be misleading if it is not accompanied by high customer satisfaction and retention. Short-term revenue might not reflect long-term viability.
How to Quantify Product-Market Fit with Basic Math, Business Metrics, and Python
⚠️ Disclaimer: This article is tailor-made for those who appreciate the beauty of quantitative analysis over qualitative musings. As a product manager with a background in data science, I might throw in a bit of technical lingo and some fancy terms. But don’t worry, I promise to keep it light and fun. By the end of this, you'll be a whiz at calculating your PMF score.
Identifying Product-Market Fit (PMF) isn't just about feeling like you've hit the jackpot; it's about quantifiable evidence. Here's how you can measure PMF using the following components:
1. Growth Accounting Framework
2. Cohort Analysis
3. Distribution of Product Market Fit
Growth accounting framework
Key Components to Measure
- New: Revenue or user activity from new customers in the current period.
- Churned: Revenue or activity lost from customers who were active before but not now.
- Resurrected: Revenue or activity from customers who returned after previously churning.
- Expansion: Increased revenue or activity from existing customers.
- Contraction: Decreased revenue or activity from existing customers who haven't churned.
- Retained: Revenue or activity carried over from the previous period.
Calculating These Components
To determine these components, you can start with a few basic equations:
- Total Revenue in Current Period:
Total Revenue(t) = retained(t) + new(t) + resurrected(t) + expansion(t)
- Total Revenue from Previous Period:
Total Revenue(t-1) = retained(t) + churned(t) + contraction(t)
- Change in Revenue
Total Revenue(t) – Total Revenue(t-1) = new(t) + expansion(t) + resurrected(t) – churned(t) – contraction(t)
- Growth Rate in Percentage
Growth_rate ~ New_rate + Resurrected_rate + Expansion_rate – Contraction_rate – Churn_rate
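As a minimal sketch of the equations above (not the GrowthAccounting class used later in this article), the components for one period can be derived from two per-customer revenue snapshots. The function name `growth_components`, the dictionaries, and the `ever_active` set are illustrative assumptions.

```python
def growth_components(prev, curr, ever_active):
    """Split the revenue change between two periods into growth accounting buckets.

    prev, curr: dicts mapping customer id -> revenue in period t-1 and t.
    ever_active: set of customer ids seen in any period before t.
    """
    comp = dict(new=0.0, resurrected=0.0, expansion=0.0,
                contraction=0.0, churned=0.0, retained=0.0)
    for cust in set(prev) | set(curr):
        before = prev.get(cust, 0.0)
        now = curr.get(cust, 0.0)
        if before == 0 and now > 0:
            # Not active last period: either brand new or back after churning
            key = "resurrected" if cust in ever_active else "new"
            comp[key] += now
        elif before > 0 and now == 0:
            comp["churned"] += before
        elif before > 0 and now > 0:
            # Retained is the part of last period's revenue that carried over
            comp["retained"] += min(before, now)
            if now > before:
                comp["expansion"] += now - before
            else:
                comp["contraction"] += before - now
    return comp


# Hypothetical monthly revenue per customer
prev = {"a": 100, "b": 50, "c": 80, "d": 40}
curr = {"a": 120, "c": 60, "d": 40, "e": 60, "f": 20}
ever_active = {"a", "b", "c", "d", "f"}  # "f" churned before period t-1

comp = growth_components(prev, curr, ever_active)
print(comp)
```

With these buckets, the current-period total equals retained + new + resurrected + expansion, and the previous-period total equals retained + churned + contraction, which is a quick sanity check on any implementation.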
To gain deeper insights into your PMF, consider these key metrics:
- Gross Retention Rate:
Gross Retention = Retained Revenue(t) / Total Revenue(t-1)
- Quick Ratio:
Quick Ratio = (New Revenue(t) + Resurrected Revenue(t) + Expanded Revenue(t)) / (Churned Revenue(t) + Contracted Revenue(t))
- Net Churn Rate
Net Churn = (Churned Revenue(t) + Contracted Revenue(t) – Resurrected Revenue(t) – Expanded Revenue(t)) / Total Revenue(t-1)
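The three metrics can be sketched as one small function over the component totals. This is an illustration under assumptions: `pmf_metrics` and the example numbers are hypothetical, and the quick ratio here follows the common definition of gained revenue (new, resurrected, expansion) divided by lost revenue (churned, contraction).

```python
def pmf_metrics(components, prev_total):
    """components: growth accounting totals for period t.
    prev_total: Total Revenue(t-1)."""
    gains = components["new"] + components["resurrected"] + components["expansion"]
    losses = components["churned"] + components["contraction"]
    return {
        "gross_retention": components["retained"] / prev_total,
        # No losses at all means the ratio is unbounded
        "quick_ratio": gains / losses if losses else float("inf"),
        "net_churn": (losses - components["resurrected"] - components["expansion"]) / prev_total,
        "growth_rate": (gains - losses) / prev_total,
    }


# Hypothetical totals for one month
example = {"new": 60, "resurrected": 20, "expansion": 20,
           "churned": 50, "contraction": 20, "retained": 200}
metrics = pmf_metrics(example, prev_total=270)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A quick ratio above 1 means more revenue was gained than lost in the period; a negative net churn means expansion and resurrection outweighed churn and contraction.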
Here's how I calculated this using Python
Growth Accounting for users
I’ve created a Python class, GrowthAccounting, that lets you analyze how well your product fits the market using customer behavior data. Here’s a brief overview:
Purpose: This class helps businesses measure Product-Market Fit (PMF) by analyzing customer interactions or revenue over different periods.
Initialization: Set up the class with your desired period (daily, weekly, monthly, etc.) and specify if your data is simple (each row represents a single interaction) or detailed (each row represents multiple interactions).
Metrics Calculated: I've also created methods to calculate the metrics mentioned above and the growth rates.
How to use it
# Initialize the class
ga = GrowthAccounting(period='M', simple=True)
# Fit the model with the data
ga.fit(data=my_data, column_date='date', column_id='user_id')
# Plot the results
ga.plot()
This class helps you visualize and understand key growth metrics, enabling data-driven decisions to improve your product’s market performance.
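The GrowthAccounting class itself is not reproduced in this article. As a rough idea of what user-level growth accounting involves, here is a minimal pure-Python sketch that counts new, retained, resurrected, and churned users per period. The function `user_growth_accounting` and the sample events are hypothetical, and the sketch assumes every period has at least one active user.

```python
from collections import defaultdict


def user_growth_accounting(events):
    """events: iterable of (period, user_id); period labels must sort chronologically."""
    active = defaultdict(set)
    for period, user in events:
        active[period].add(user)

    rows, seen, previous = [], set(), set()
    for period in sorted(active):
        current = active[period]
        rows.append({
            "period": period,
            "new": len(current - seen),                       # never seen before
            "retained": len(current & previous),              # active last period too
            "resurrected": len((current & seen) - previous),  # seen before, skipped last period
            "churned": len(previous - current),               # active last period, gone now
        })
        seen |= current
        previous = current
    return rows


# Hypothetical interaction log
events = [
    ("2023-01", "a"), ("2023-01", "b"), ("2023-01", "a"),
    ("2023-02", "a"), ("2023-02", "c"),
    ("2023-03", "b"), ("2023-03", "c"),
]
rows = user_growth_accounting(events)
for row in rows:
    print(row)
```

In every period, the number of active users equals new + retained + resurrected, which mirrors the revenue identity from the growth accounting equations.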
Growth Accounting for revenue
I used the same Python class to calculate growth accounting for revenue; I only needed to pass the user_id and revenue columns. This is not the simple mode, because the revenue column supplies the value of each row, so simple=False:
rev_growth = GrowthAccounting(period='M', simple=False)
# Fit the model
rev_growth.fit(revenue, 'date', 'user_id', 'revenue')
# Plot the revenue growth
rev_growth.plot()
Cohort Analysis
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


class Cohorts:
    # pandas period aliases for the accepted (lowercased) period names
    FREQ = {'m': 'M', 'q': 'Q', 'd': 'D', '28d': '28D', '7d': '7D'}

    def __init__(self, period='M', simple=True):
        self.period = period.lower()
        if self.period not in self.FREQ:
            raise ValueError("Period should be one of these: m, q, d, 28d, 7d.")
        self.simple = simple
        self.arguments = []

    def fit(self, data, column_date, column_id, column_input=None, how=None):
        # `how` (e.g. ['total', 'accum']) is accepted but not used in the code shown here
        df = data.copy()
        self.column_id = column_id  # needed later by plot_heatmap
        df['unique_id'] = np.arange(len(df))
        if self.simple:
            # Simple mode: each row counts as one interaction
            column_input = 'column_input'
            df[column_input] = 1
        df['period'] = df[column_date].dt.to_period(self.FREQ[self.period])
        # A user's cohort is the first period in which they were active
        df['cohort'] = df.groupby(column_id)['period'].transform('min')
        self.df_period_cohort = df
        # Build every (cohort, period) pair, then keep periods on or after the cohort start
        cohorts = df.groupby(['cohort', 'period'])[[column_input]].count().unstack().fillna(0).stack().reset_index()
        cohorts = cohorts[cohorts['cohort'] <= cohorts['period']]
        cohorts['period_num'] = cohorts.groupby('cohort').cumcount()
        self.period_list = cohorts['period'].unique()
        self.cohort_list = cohorts['cohort'].unique()
        self.df_cohorts = cohorts[['cohort', 'period', 'period_num']]

    def apply_unique_users(self, column_id):
        df = self.df_period_cohort
        df_cohorts = self.df_cohorts.set_index(['cohort', 'period'])
        # Assign on the shared (cohort, period) index so the values line up
        unique_users = df.groupby(['cohort', 'period'])[column_id].nunique()
        df_cohorts['unique_users'] = unique_users
        df_cohorts['unique_users'] = df_cohorts['unique_users'].fillna(0)
        df_cohorts = df_cohorts.reset_index()
        # Share of each cohort's first-period users still active in each period
        first = df_cohorts.groupby('cohort')['unique_users'].transform('first')
        df_cohorts['perc_unique_users'] = df_cohorts['unique_users'] / first
        self.df_cohorts = df_cohorts

    def plot_heatmap(self, label, title, way='period'):
        if 'unique_users' not in self.df_cohorts.columns:
            self.apply_unique_users(self.column_id)
        cohort_size = self.df_cohorts.groupby('cohort')['unique_users'].first().rename('cohort_size')
        df_coh = self.df_cohorts.set_index(['cohort', way])[label].unstack()
        max_coh, max_per = len(self.cohort_list) * 0.75, len(self.period_list) * 1.5
        # Left panel: cohort sizes; right panel: the heatmap itself
        fig, (ax1, ax2) = plt.subplots(1, 2, gridspec_kw={'width_ratios': [1, max(max_per/3, 3)]}, figsize=(max_per, max_coh))
        cohort_size.sort_index(ascending=False).plot(kind='barh', width=0.9, color='grey', alpha=0.5, ax=ax1)
        sns.heatmap(df_coh, cmap='coolwarm_r', center=1, vmin=0, vmax=2, annot=True, fmt='.0%', ax=ax2)
        ax1.set_xlim(xmin=0)
        ax1.set_yticklabels(pd.Series(self.cohort_list).sort_values(ascending=False))
        ax1.set_title('Cohort size')
        ax2.set_title(title)
        fig.tight_layout()
        plt.show()
How to use this
This gives you a tabular view of the customer cohorts:
users_cohort.df_cohorts.head()
This gives you a heatmap of total interactions for each cohort of users who interacted with your application:
users_cohort.plot_heatmap('total','Total Interactions',way='period')
This gives you the interaction cohorts as percentages:
users_cohort.plot_heatmap('perc_total','Total Interactions',way='period_num')
Activity Retention Curve
The activity retention curve shows how much of each cohort's activity is retained on the platform over time, and the percentage drop or increase from one period to the next.
users_cohort.plot_trends('perc_total','Activity retention',way='period_num')
From the chart, we can infer that retention of total interactions is very low: by the second month it is below 20%!
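The plot_trends method is not shown in this article, so here is a minimal pandas sketch of how an activity retention curve can be computed from a raw interaction log. The DataFrame, its column names, and the averaging across cohorts are assumptions made for illustration, not the author's implementation.

```python
import pandas as pd

# Hypothetical interaction log: one row per user interaction
events = pd.DataFrame({
    "user_id": ["a", "a", "a", "b", "b", "c", "c", "c"],
    "date": pd.to_datetime([
        "2023-01-05", "2023-01-20", "2023-02-03",
        "2023-01-11", "2023-03-02",
        "2023-02-07", "2023-02-21", "2023-03-15",
    ]),
})

events["period"] = events["date"].dt.to_period("M")
# A user's cohort is the month of their first interaction
first_date = events.groupby("user_id")["date"].transform("min")
events["cohort"] = first_date.dt.to_period("M")
# Months elapsed since the cohort month
events["period_num"] = (
    (events["period"].dt.year - events["cohort"].dt.year) * 12
    + (events["period"].dt.month - events["cohort"].dt.month)
)

# Interactions per cohort and month offset; NaN where a cohort has no data at that offset
totals = events.groupby(["cohort", "period_num"]).size().unstack()
# Express every offset as a share of the cohort's first month
retention = totals.div(totals[0], axis=0)
# Average across cohorts, skipping cells with no data
curve = retention.mean()
print(retention)
print(curve)
```

One caveat of this sketch: a cohort with no interactions in a month it could have been active shows up as missing rather than as 0%, so the averaged curve can look slightly better than reality.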
Now let's work on the unique user cohort
# This gives you a heatmap of the unique users who came to your platform
users_cohort.plot_heatmap('unique_users','Users',way='period')
users_cohort.plot_trends('perc_unique_users','Logo retention',way='period_num')
For unique users, the trend is slightly better. It takes approximately six months for the user base to decline by 20%. This suggests that users engage heavily with the app in the first month, likely due to marketing efforts and the novelty factor. In the following months, they keep the app on their phones but use it only once or twice.
Now let's check how many customers left the platform, or churned
# This gives you a detailed heatmap of user churn
users_cohort.plot_heatmap('churn_unique','Users churn',way='period')
# This gives you user retention as a percentage
users_cohort.plot_heatmap('perc_unique_users','Users',way='period_num')
💸 Churn follows the same pattern we saw in the previous charts.
After evaluating churn, let's look at the revenue of each cohort
# Create the revenue cohort with a monthly period
revenue_cohort = Cohorts(period='M', simple=False)
# Fit the revenue cohort on the revenue data
revenue_cohort.fit(revenue, 'date', 'user_id', 'revenue', how=['total', 'churn_total', 'accum', 'per_user'])
# Total Revenue heatmap by cohort
revenue_cohort.plot_heatmap('total','Total Revenue',way='period')
# Revenue Cohort by percentage
revenue_cohort.plot_heatmap('perc_total','Total Revenue',way='period_num')
Now let's calculate customer lifetime value (LTV)
revenue_cohort.plot_heatmap('accum','Cohort LTV',way='period')
January 2023 had the highest influx of users, with a decline in new users in subsequent months. Early cohorts, especially January 2023, show higher LTVs, indicating these users generate more revenue over time. The steady increase in LTV across most cohorts is a positive sign of user retention and revenue growth.
revenue_cohort.plot_trends('accum','Cohort LTV',way='period_num')
The line graph confirms this trend, showing cumulative LTV growth for each cohort. January 2023 consistently has the highest LTV, while some cohorts, like April and May 2023, grow rapidly initially but stabilize later, indicating potential drops in activity or spending. The mean LTV line serves as a useful benchmark. Overall, these visualizations suggest strong user retention and growing engagement, though variability across cohorts may be due to different acquisition strategies or user behaviors.
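To make the cohort LTV numbers concrete, here is a minimal pandas sketch that assumes LTV means cumulative revenue per cohort member. That definition, the DataFrame, and the column names are illustrative assumptions; the 'accum' metric in the author's class may be computed differently.

```python
import pandas as pd

# Hypothetical revenue log: one row per payment
payments = pd.DataFrame({
    "user_id": ["a", "a", "b", "b", "c"],
    "date": pd.to_datetime([
        "2023-01-10", "2023-02-10", "2023-01-15", "2023-03-01", "2023-02-05",
    ]),
    "revenue": [10.0, 20.0, 30.0, 5.0, 40.0],
})

payments["period"] = payments["date"].dt.to_period("M")
# A user's cohort is the month of their first payment
first_date = payments.groupby("user_id")["date"].transform("min")
payments["cohort"] = first_date.dt.to_period("M")

cohort_size = payments.groupby("cohort")["user_id"].nunique()
revenue = payments.groupby(["cohort", "period"])["revenue"].sum().unstack()

# Cumulative revenue per cohort, divided by the number of users in the cohort
ltv = revenue.fillna(0).cumsum(axis=1).div(cohort_size, axis=0)

# Blank out periods that fall before the cohort started
started = pd.DataFrame(
    [[period >= cohort for period in ltv.columns] for cohort in ltv.index],
    index=ltv.index, columns=ltv.columns,
)
ltv = ltv.where(started)
print(ltv)
```

Reading a row from left to right gives the LTV curve of one cohort; a curve that keeps rising means the cohort is still spending.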
PMF Distribution Framework
The third standard technique involves observing the distribution of product-market fit. When looking at revenue, we typically inspect the cumulative distribution function (CDF) of monthly revenue. This approach helps us understand what “typical” looks like, beyond just average contract value (ACV). While ACV is useful for understanding overall financial impact, it can be skewed by outliers. The CDF provides a clearer picture of the distribution, allowing us to see the median and other percentiles, which better represent the “typical” customer.
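A small pure-Python sketch illustrates why the CDF is more informative than the average. The helper names `empirical_cdf` and `percentile` and the revenue figures are hypothetical; the percentile rule used here (smallest observed value whose cumulative share reaches q) is one of several common conventions.

```python
def empirical_cdf(values):
    """Return the sorted values and the share of customers at or below each one."""
    xs = sorted(values)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]


def percentile(values, q):
    """Smallest observed value whose cumulative share reaches q (0 < q <= 1)."""
    xs, shares = empirical_cdf(values)
    for x, share in zip(xs, shares):
        if share >= q:
            return x
    return xs[-1]


# Hypothetical monthly revenue per customer, with one very large account
monthly_revenue = [20, 25, 30, 30, 35, 40, 45, 50, 60, 2000]

mean_revenue = sum(monthly_revenue) / len(monthly_revenue)
median_revenue = percentile(monthly_revenue, 0.5)
p90_revenue = percentile(monthly_revenue, 0.9)
print(mean_revenue, median_revenue, p90_revenue)  # 233.5 35 60
```

The single large account pulls the mean to 233.5, while the median customer pays 35 and even the 90th percentile pays only 60, which is the kind of gap the CDF makes visible.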
Calculate Revenue for Each Customer:
# Plot revenue per customer by cohort and period
revenue_cohort.plot_heatmap('per_user','Revenue per user',way='period')
Revenue Distribution by customer
# Distribution of Product-Market Fit: Revenue distribution by user
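# Total revenue per user
revenue_distribution = spot_the_lion_revenue.groupby('user_id').agg({'revenue': 'sum'}).reset_index()
# Split users into revenue deciles; duplicates='drop' avoids an error when many users share the same revenue
revenue_distribution['revenue_bin'] = pd.qcut(revenue_distribution['revenue'], 10, duplicates='drop')
# Users and revenue per bin; observed=True keeps only bins that contain users
revenue_distribution_summary = revenue_distribution.groupby('revenue_bin', observed=True).agg({'user_id': 'count', 'revenue': 'sum'}).reset_index()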
# Visualization of revenue distribution
plt.figure(figsize=(12, 6))
sns.barplot(x='revenue_bin', y='revenue', data=revenue_distribution_summary)
plt.title('Revenue Distribution by User')
plt.xlabel('Revenue Bin')
plt.ylabel('Total Revenue')
plt.xticks(rotation=90)
plt.show()
Let's use all these frameworks to create a composite PMF score
Create a weighted index using the key metrics.
For example:
- Growth Rate (30%)
- Gross Retention (20%)
- Net Churn (20%)
- Quick Ratio (20%)
- Cohort LTV and Retention (10%)
pmf_score = (0.3 * growth_rate_score + 0.2 * gross_retention_score + 0.2 * net_churn_score + 0.2 * quick_ratio_score + 0.1 * cohort_ltv_retention_score)
Score Normalization:
- Normalize each metric to a scale of 0-10, then calculate the weighted average to get the PMF score.
Example calculation
Let's assume the calculations above give us the following values:
growth_rate = 0.15
gross_retention = 0.95
net_churn = -0.05
quick_ratio = 4.5
cohort_ltv_retention = 0.85
Now let's use Python again to calculate the PMF score
# Example data
growth_rate = 0.15 # 15%
gross_retention = 0.95 # 95%
net_churn = -0.05 # -5%
quick_ratio = 4.5 # 4.5x
cohort_ltv_retention = 0.85 # 85%
# Normalizing to 0-10 scale (example, assuming ideal conditions)
growth_rate_score = min(10, growth_rate / 0.15 * 10)  # 10 if 15% or more
gross_retention_score = gross_retention * 10  # 10 if 100%
net_churn_score = min(10, (1 - net_churn) * 10)  # 10 if net churn is 0% or negative
quick_ratio_score = min(10, quick_ratio * 2)  # 10 if 5x or more
cohort_ltv_retention_score = cohort_ltv_retention * 10  # 10 if 100%
# Composite PMF Score
pmf_score = (0.3 * growth_rate_score + 0.2 * gross_retention_score +
             0.2 * net_churn_score + 0.2 * quick_ratio_score +
             0.1 * cohort_ltv_retention_score)
print("PMF Score:", round(pmf_score, 2))
PMF Score: 9.55
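The weighted average is only one way to combine the normalized scores. As a hedged sketch of an alternative mentioned at the start of this article, a weighted geometric mean penalizes a single weak metric more heavily. The function names and the example scores below are hypothetical; the weights are the ones from the list above.

```python
import math


def weighted_average(scores, weights):
    total = sum(weights.values())
    return sum(scores[k] * weights[k] for k in weights) / total


def weighted_geometric_mean(scores, weights):
    """Requires strictly positive scores, since it works on logarithms."""
    total = sum(weights.values())
    return math.exp(sum(weights[k] * math.log(scores[k]) for k in weights) / total)


weights = {"growth": 0.3, "gross_retention": 0.2, "net_churn": 0.2,
           "quick_ratio": 0.2, "cohort": 0.1}
# Hypothetical 0-10 scores: strong retention metrics but weak growth
scores = {"growth": 2.0, "gross_retention": 9.0, "net_churn": 9.0,
          "quick_ratio": 9.0, "cohort": 9.0}

avg = weighted_average(scores, weights)
geo = weighted_geometric_mean(scores, weights)
print(round(avg, 2), round(geo, 2))
```

With these inputs the geometric mean comes out lower than the weighted average, because the weak growth score drags the product down more than it drags the sum.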
Interpretation of PMF Score
- 8-10: Strong Product-Market Fit: High growth rate, high retention, low or negative churn, efficient growth, and strong cohort metrics.
- 5-7: Moderate Product-Market Fit: Decent growth rate, acceptable retention, manageable churn, moderate growth efficiency, and average cohort metrics.
- 0-4: Weak Product-Market Fit: Low growth rate, poor retention, high churn, low growth efficiency, and weak cohort metrics.
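The bands above can be turned into a small helper. This is a sketch with one assumption worth flagging: the bands are stated for whole numbers, so the function below treats 8 and 5 as lower cut-offs, which places a fractional score such as 7.5 in the moderate band. The name `interpret_pmf_score` is hypothetical.

```python
def interpret_pmf_score(score):
    """Map a 0-10 composite score to a band (cut-offs at 8 and 5 are an assumption)."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score >= 8:
        return "Strong Product-Market Fit"
    if score >= 5:
        return "Moderate Product-Market Fit"
    return "Weak Product-Market Fit"


for value in (9.2, 7.5, 3.0):
    print(value, "->", interpret_pmf_score(value))
```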