Online Metrics for Model Evaluation in Production

Offline metrics provide key insights during the model training and evaluation phase. However, they say little about how a model performs once it has been deployed in the real world. In production, one needs online metrics to measure the impact of the product offering on actual users, their interactions, and business outcomes.

The AARRR framework, known as "Pirate Metrics," is a go-to framework to measure product performance across key user journey stages. Let's first understand what AARRR stands for:

Acquisition. These metrics track how users discover and begin engaging with your product — fundamental for growth strategies and identification of the most effective channels for reaching the target audiences.
Activation and User Experience. This key phase measures how quickly users can recognize the product's value.
Retention. With retention, we aim to measure ongoing user engagement and long-term product stickiness.
Revenue Generation. Revenue metrics directly connect user behaviour to business outcomes. This includes measuring conversion rates, average revenue per user, customer lifetime value, and transaction patterns.
Referral Impact. Here, we evaluate how users share and recommend the product to others. This may include tracking referral rates, viral coefficients, and the effectiveness of word-of-mouth campaigns.

1. Acquisition Metrics

Total Leads Generated. This is the total number of potential customers from various marketing channels and campaigns — email signups, contact form submissions, trial registrations, and other expressions of interest. A high-quality lead typically provides contact information and shows genuine interest in the product or service. Lead scoring systems can assess the quality of leads.
Customer Acquisition Cost (CAC). CAC is the cost of converting a prospect into a customer, which may include marketing, sales, and promotional expenses. It is calculated by dividing total acquisition expenses by the number of new customers acquired within a specific period. Benchmarks and customer lifetime value (LTV) can help determine the acceptable CAC levels for sustainable business growth.

$CAC = \frac{\text{Total Acquisition Costs}}{\text{Number of New Customers Acquired}}$

Conversion Rate. This is the percentage of leads that complete a desired action like purchase, signup, or download. This metric can be measured at various funnel stages, from initial website or app visits to final purchases or enrollment. Conversion rates vary significantly across traffic sources, devices, and user segments. Understanding these variations helps tailor marketing strategies to each channel and segment.
Cost Per Lead (CPL). CPL measures the average expense required to generate a single lead, calculated by dividing total lead generation costs by the number of leads acquired. This metric should be analyzed alongside lead quality metrics to ensure marketing efforts generate valuable prospects, not just high volumes of low-quality leads.

$CPL = \frac{\text{Total Lead Generation Costs}}{\text{Number of Leads Acquired}}$

Lead-to-Customer Rate. The percentage of leads that convert into paying customers. A low rate might indicate poor lead quality, ineffective sales processes, or misalignment between marketing messages and actual product value. Tracking this rate over time helps identify seasonal patterns and long-term trends in product sales.
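
The acquisition formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the `visitor_to_lead_rate` label, and all input figures are made up for the example, and the choice to return 0.0 when a denominator is zero is an assumption.

```python
def safe_ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def acquisition_metrics(
    total_acquisition_cost: float,
    lead_generation_cost: float,
    visitors: int,
    leads: int,
    new_customers: int,
) -> dict:
    """Compute the acquisition metrics described above for one period."""
    return {
        # CAC = total acquisition costs / number of new customers acquired
        "cac": safe_ratio(total_acquisition_cost, new_customers),
        # CPL = total lead generation costs / number of leads acquired
        "cpl": safe_ratio(lead_generation_cost, leads),
        # Conversion rate at the top of the funnel (visitor -> lead)
        "visitor_to_lead_rate": safe_ratio(leads, visitors),
        # Share of leads that became paying customers
        "lead_to_customer_rate": safe_ratio(new_customers, leads),
    }


# Hypothetical figures for one month, purely for illustration.
metrics = acquisition_metrics(
    total_acquisition_cost=12_000,
    lead_generation_cost=5_000,
    visitors=20_000,
    leads=1_000,
    new_customers=80,
)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Reporting the two rates next to CAC and CPL makes it easier to see whether a cheap channel is also producing leads that actually convert.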

2. Activation

Activation Rate. The percentage of users who achieve the initial milestone that signifies product activation. This could include completing a setup process, a first meaningful interaction, or other defined success criteria. The milestone varies by product type: for social platforms, it might be making the first connection, while for productivity tools, it could be creating the first project.
Time to First Key Action. This is the average time it takes for a new user to complete a key in-product action after signing up. This metric directly measures the success of the onboarding process and initial user engagement. Shorter times typically indicate that users understand the product easily. Regularly monitoring this across cohorts may help optimize the path to the first key actions through improved UI design, documentation, or user guidance.
Product Usage in the First Week. Measures how frequently a new user interacts with core features during their first week. This includes tracking daily active usage, session duration, and interaction patterns with key product functionality. Usually, first-week usage patterns strongly indicate long-term engagement potential and help identify early churn risks.
Feature Adoption Rate. This is the percentage of users engaging with specific features after onboarding. This metric tracks how successfully users discover and utilize different product offerings beyond basic functionality. Feature adoption patterns may help identify aspects of the product that resonate with users and might need better visibility or explanation.
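
As a rough sketch of how activation rate and time to first key action might be computed from signup data, consider the following. The input shape (pairs of signup time and first key action time), the 7-day activation window, and the function name are all assumptions made for illustration; a real product would define its own milestone and window.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


def activation_summary(
    users: list[tuple[datetime, Optional[datetime]]],
    window: timedelta = timedelta(days=7),
) -> dict:
    """Summarise activation for (signup_time, first_key_action_time) pairs.

    first_key_action_time is None for users who never performed the action.
    A user counts as activated only if the action happened within `window`.
    """
    hours_to_action = [
        (action - signup).total_seconds() / 3600
        for signup, action in users
        if action is not None and action - signup <= window
    ]
    return {
        "activation_rate": len(hours_to_action) / len(users) if users else 0.0,
        "mean_hours_to_first_key_action": (
            mean(hours_to_action) if hours_to_action else None
        ),
    }


# Hypothetical cohort: two users activate quickly, one never does,
# and one acts only after the 7-day window has closed.
signup = datetime(2024, 1, 1)
cohort = [
    (signup, signup + timedelta(hours=2)),
    (signup, signup + timedelta(hours=10)),
    (signup, None),
    (signup, signup + timedelta(days=10)),
]
summary = activation_summary(cohort)
print(summary)
```

Running the same function on each signup cohort (for example, per week) gives the cohort-level comparison described above.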

3. Retention and User Engagement

Churn Rate. This is the proportion of users who discontinue using a product or cancel their service within a specified time frame. The period can vary depending on the industry — it can be days, months, or even years. While analysing this, take into consideration voluntary churn (users actively cancelling) and involuntary churn (payment failures or technical issues). Understanding churn patterns across user segments and pricing tiers helps teams implement targeted retention strategies.
Customer Retention Rate (CRR). CRR represents the percentage of customers who continue using a product over a defined period. It is a good indicator of product stickiness and customer satisfaction. A high CRR often reflects a strong product-market fit and successful customer success strategies. CRR is typically evaluated over various timeframes, such as 30 days, 90 days, or annually, and across different user segments to gain insights into long-term engagement trends.
Repeat Purchase Rate. This metric measures the percentage of customers who make more than one purchase within a specified timeframe. Depending on your goal, it can count repeat purchases of the same product or of different products. By looking at this metric, you can identify which products or services drive repeat purchases and determine the customer segments with the highest loyalty. Insights into purchase frequency, order value trends, and the time between purchases can guide marketing efforts and product strategies.
Cohort Retention Rate. Cohort retention analysis evaluates how users with specific characteristics engage with a product over time. It reveals retention trends and helps determine whether product changes or updates have positively impacted user engagement.
Net Dollar Retention Rate (NDRR). NDRR measures the change in recurring revenue from existing customers over a specific period, accounting for upgrades, downgrades, and churn. This metric is essential in subscription-based businesses, reflecting customer satisfaction and growth potential. A high NDRR indicates effective upselling (selling a customer a better version of the product) and cross-selling (selling a related product) strategies. To maximize NDRR, analyze the factors influencing both upgrades and downgrades and refine pricing and packaging strategies accordingly.
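
The retention metrics above reduce to simple ratios. The sketch below uses commonly cited formulations, which may differ from the exact definitions a given team adopts: churn as customers lost over customers at the start of the period, retention as customers kept (excluding new signups) over customers at the start, and net dollar retention as starting recurring revenue adjusted for expansion, contraction, and churn, divided by starting recurring revenue. All function names and figures are illustrative assumptions.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Share of the starting customer base that left during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


def retention_rate(
    customers_at_start: int, customers_at_end: int, new_customers: int
) -> float:
    """Share of starting customers still present at the end of the period.

    New customers acquired during the period are excluded from the numerator.
    """
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return (customers_at_end - new_customers) / customers_at_start


def net_dollar_retention(
    starting_mrr: float, expansion: float, contraction: float, churned: float
) -> float:
    """Recurring revenue kept from existing customers, relative to the start.

    Values above 1.0 mean upgrades outweighed downgrades and churn.
    """
    if starting_mrr <= 0:
        raise ValueError("starting_mrr must be positive")
    return (starting_mrr + expansion - contraction - churned) / starting_mrr


# Hypothetical month: 200 customers at the start, 10 lost, 30 new, 220 at the end.
print(f"churn:     {churn_rate(200, 10):.1%}")
print(f"retention: {retention_rate(200, 220, 30):.1%}")
# Hypothetical recurring revenue movements among existing customers.
print(f"NDRR:      {net_dollar_retention(50_000, 6_000, 1_500, 2_500):.1%}")
```

With this formulation, churn and retention for the same period sum to one, which is a useful sanity check on the underlying customer counts.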

The following metrics are commonly used to measure customer engagement:

Active Users (DAU/MAU). With DAU/MAU, we aim to quantify the number of unique users interacting with a product daily and monthly. Monitoring DAU and MAU provides insights into user engagement and growth trends.
Session Duration. It represents the average time users spend in the app or on the platform during a single session. Longer session durations often indicate that users find the content engaging and are willing to spend more time exploring the product. However, it's essential to ensure that extended sessions result from positive engagement rather than user confusion or difficulty navigating the platform.
Stickiness (DAU/MAU Ratio). Stickiness is calculated by dividing DAU by MAU, providing a ratio that reflects how often users return to the product within a month. A higher stickiness ratio signifies that users consistently engage with the product, which is a good sign for growth and sustainability. For example, a DAU/MAU ratio of 50% would mean that the average user engages with the product 15 out of 30 days a month.
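Net Promoter Score (NPS). NPS measures customer loyalty by asking users how likely they are to recommend the product to others on a scale of 0 to 10. Respondents who answer 9 or 10 count as promoters and those who answer 0 to 6 as detractors; NPS is the percentage of promoters minus the percentage of detractors. A higher NPS indicates a greater likelihood of users recommending the product, which can drive organic growth through word-of-mouth referrals.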
Pages Per Session. This metric represents the average number of pages or screens a user views during each session. A higher number suggests that users are exploring more content, which can indicate effective navigation and engaging material. Conversely, if users view many pages but do not take desired actions (e.g., making a purchase), it may indicate issues with content relevance or user experience.
Drop-off Rate. The drop-off rate measures the percentage of users who abandon the product at various stages of interaction. Identifying where users disengage helps pinpoint problem areas within the user journey.
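
Two of the engagement metrics above, stickiness and NPS, are easy to compute from raw data. The sketch below assumes an activity log of (user_id, day) pairs and a list of 0 to 10 survey scores; both input shapes and the function names are hypothetical. Stickiness here uses the average DAU over the period divided by the number of unique users in the period, and NPS uses the standard promoter (9 to 10) and detractor (0 to 6) bands.

```python
from collections import defaultdict
from datetime import date


def stickiness(events: list[tuple[str, date]], days_in_period: int) -> float:
    """Average daily active users divided by unique users over the period."""
    users_by_day: dict[date, set[str]] = defaultdict(set)
    for user_id, day in events:
        users_by_day[day].add(user_id)
    period_users = set().union(*users_by_day.values())
    if not period_users or days_in_period <= 0:
        return 0.0
    # Days with no activity contribute zero to the average DAU.
    average_dau = sum(len(users) for users in users_by_day.values()) / days_in_period
    return average_dau / len(period_users)


def net_promoter_score(scores: list[int]) -> float:
    """Percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    if not scores:
        raise ValueError("at least one survey response is required")
    promoters = sum(1 for score in scores if score >= 9)
    detractors = sum(1 for score in scores if score <= 6)
    return 100 * (promoters - detractors) / len(scores)


# Hypothetical 4-day period: user "a" is active every day, user "b" only once.
log = [("a", date(2024, 1, d)) for d in range(1, 5)] + [("b", date(2024, 1, 1))]
print(f"stickiness: {stickiness(log, days_in_period=4):.3f}")

# Hypothetical survey responses.
responses = [10, 9, 8, 7, 6, 3, 10, 9, 5, 10]
print(f"NPS: {net_promoter_score(responses):.0f}")
```

Responses of 7 or 8 (often called passives) count toward the total number of responses but toward neither band, which is why they pull the score toward zero.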

4. Revenue Generation Metrics

Customer Lifetime Value (CLV or LTV). It estimates the total revenue a business can expect from a single customer over the entire duration of their relationship. By understanding CLV, companies can make informed decisions about customer acquisition costs and retention strategies, ensuring that the projected revenue justifies the investment in acquiring a customer. For instance, if a customer subscribes to a service for Ugx 100,000 annually and is expected to remain loyal for five years, the CLV would be Ugx 500,000.
Average Revenue Per User (ARPU). ARPU is the mean revenue generated per user over a specific period, typically monthly or annually. It is calculated by dividing total revenue by the number of active users. In subscription-based businesses, comparing ARPU across segments or plans can highlight potential areas for revenue enhancement.
Revenue Per Paying User (RPPU). RPPU is similar to ARPU but counts only users who have made purchases, making it particularly useful for freemium models where only a subset of users are paying customers. By looking at RPPU, businesses can evaluate the spending behaviour of their paying customers, which aids in tailoring premium offerings and optimizing pricing strategies to maximize revenue from this segment.
Gross Margin. Represents the difference between revenue and the cost of goods sold (COGS), expressed as a percentage of revenue. A higher gross margin suggests stronger profitability, as more revenue is retained after covering the direct costs associated with production. For example, a gross margin of 60% means that for every shilling earned, 60 cents is retained as gross profit.
Revenue Growth Rate. This metric measures the rate at which a company's revenue increases over a specific period, usually quarterly or annually. It's a key indicator of business expansion and market demand. Sustained revenue growth reflects successful business strategies and a strong market position, while declining growth may signal the need for strategic reassessment.
Churned Revenue. Churned revenue quantifies the revenue lost due to customers discontinuing the service within a given period. High churned revenue can indicate customer dissatisfaction or competitive pressures. Monitoring this metric helps identify retention issues and develop strategies to reduce customer attrition, thereby stabilizing revenue streams.
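
The revenue metrics above can be sketched as small helper functions. The CLV helper uses the simple "revenue per year times expected years" view from the subscription example in the text, which ignores discounting, churn probability, and costs; that simplification, the function names, and every figure other than the Ugx 100,000 subscription are assumptions for illustration.

```python
def simple_clv(annual_revenue_per_customer: float, expected_years: float) -> float:
    """Simplified CLV: yearly revenue multiplied by expected relationship length."""
    return annual_revenue_per_customer * expected_years


def arpu(total_revenue: float, active_users: int) -> float:
    """Average revenue per active user for the period."""
    return total_revenue / active_users if active_users else 0.0


def rppu(total_revenue: float, paying_users: int) -> float:
    """Average revenue per paying user for the period."""
    return total_revenue / paying_users if paying_users else 0.0


def gross_margin(revenue: float, cogs: float) -> float:
    """Share of revenue left after the cost of goods sold."""
    return (revenue - cogs) / revenue if revenue else 0.0


def revenue_growth_rate(previous_revenue: float, current_revenue: float) -> float:
    """Relative change in revenue between two consecutive periods."""
    if previous_revenue <= 0:
        raise ValueError("previous_revenue must be positive")
    return (current_revenue - previous_revenue) / previous_revenue


# The subscription example from the text: Ugx 100,000 per year for five years.
print(f"CLV:          Ugx {simple_clv(100_000, 5):,.0f}")

# Hypothetical period figures for a freemium product.
revenue, cogs = 1_000_000, 400_000
print(f"ARPU:         {arpu(revenue, active_users=5_000):,.2f}")
print(f"RPPU:         {rppu(revenue, paying_users=500):,.2f}")
print(f"Gross margin: {gross_margin(revenue, cogs):.0%}")
print(f"Growth:       {revenue_growth_rate(800_000, revenue):.0%}")
```

The gap between ARPU and RPPU in a freemium product reflects how small the paying segment is, which is why the two are usually read together.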

5. Referral Metrics

Viral Coefficient. This metric quantifies how many new users each existing user brings in through referrals, indicating the potential for viral growth. It's calculated by multiplying the average number of referrals per user by the conversion rate of those referrals. A viral coefficient greater than 1 suggests that each user, on average, generates more than one new user, leading to exponential growth. In practice, that growth does not last indefinitely. No market is infinite, so as you acquire more users, the pool of potential new users shrinks, naturally reducing the viral coefficient over time. The coefficient also rarely stays constant, since early adopters are often more enthusiastic about sharing than later users, and network effects can weaken as the product becomes mainstream.
Referral Rate. Referral Rate measures the percentage of existing customers who refer new customers to a product or service. A higher referral rate indicates that customers are satisfied and willing to advocate for the brand, contributing to organic growth.
Social Shares. This metric tracks the number of times users share the product or content on social media platforms. High social share counts can amplify brand awareness, drive traffic, and increase conversions. Monitoring social shares helps in understanding the reach and impact of the content marketing efforts.
Referral Conversion Rate. This metric represents the percentage of referred users who take a desired action, such as signing up or purchasing after being referred. A higher referral conversion rate indicates the effectiveness of referral programs and the quality of leads generated through referrals.
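
To make the viral coefficient and its limits concrete, here is a small simulation sketch. The coefficient itself follows the definition above (referrals per user times referral conversion rate). The saturation model, in which each cycle's referrals are scaled by the share of the market not yet acquired, is a deliberately simple assumption chosen for illustration, as are the function names and numbers.

```python
def viral_coefficient(referrals_per_user: float, referral_conversion_rate: float) -> float:
    """Average number of new users each existing user brings in."""
    return referrals_per_user * referral_conversion_rate


def simulate_referral_growth(
    initial_users: int, k: float, market_size: int, cycles: int
) -> list[float]:
    """Total users after each referral cycle in a finite market.

    Assumption: only the newest cohort refers others, and their effective
    coefficient shrinks in proportion to the market share already acquired.
    """
    total = float(initial_users)
    newest = float(initial_users)
    history = [total]
    for _ in range(cycles):
        remaining_share = max(0.0, 1.0 - total / market_size)
        newest = newest * k * remaining_share
        total = min(float(market_size), total + newest)
        history.append(total)
    return history


k = viral_coefficient(referrals_per_user=5, referral_conversion_rate=0.3)
print(f"viral coefficient: {k:.2f}")

# With k above 1, growth starts out fast and then flattens as the market fills.
for cycle, users in enumerate(simulate_referral_growth(100, k, 10_000, 12)):
    print(f"cycle {cycle:2d}: {users:,.0f} users")
```

With a coefficient below 1, the same simulation levels off well short of the market size, since each cohort brings in fewer users than itself.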

While the metrics we've covered serve as a foundation, they may not fully capture the details of your specific industry or business objective. Some metrics can overlap categories in the AARRR framework. When evaluating your product performance in the real world, focus on selecting metrics that align closely with your unique objectives rather than strictly adhering to AARRR.
