D2C Growth

Cohort Analysis for Ecommerce: How to Build and Read Cohort Tables

Build and read ecommerce cohort analysis tables step by step. Learn the exact data requirements, retention formulas, benchmark curves by vertical, and the five mistakes that break cohort tables.

Siddharth Gangal 16 min read
Cohort Analysis for Ecommerce: How to Build and Read Cohort Tables
On this page
  1. What is cohort analysis in ecommerce?
  2. Why cohort analysis matters for ecommerce operators
  3. The data you need to build a cohort table
  4. How to build a cohort analysis table: step by step
  5. How to read a cohort table
  6. Ecommerce cohort retention benchmarks by vertical
  7. The five mistakes that break cohort tables
  8. From cohort analysis to action
  9. How Fairview handles cohort analysis
  10. Key takeaways

TL;DR

  • Cohort analysis groups customers by their first purchase month and tracks what percentage return in each subsequent month. It is the only format that reveals whether retention is improving or degrading over time.
  • You need three data points to build a cohort table: a unique customer ID, the date of each transaction, and the transaction amount. A spreadsheet with these three columns is sufficient for the first version.
  • The average D2C ecommerce retention rate is 31% at 12 months, per 2026 composite data. Food and beverage leads at 30-45%. Fashion and apparel trails at 18-30%. Subscription ecommerce outperforms at 45-65%.
  • The most common mistake is counting transactions instead of unique customers. One customer who buys three times in Month 2 should count as one retained customer, not three. This error alone can overstate retention by 20-40%.
  • Fairview builds cohort tables automatically from Shopify, Stripe, and connected sources. It tracks retention by acquisition channel, surfaces cohort LTV, and flags when a newer cohort underperforms an older one at the same elapsed time.

Most ecommerce operators know their repeat purchase rate. Few can tell you whether the customers they acquired last month are more or less likely to return than the customers they acquired six months ago. That is the question cohort analysis answers. Without it, you are flying blind on the single metric that determines whether your business compounds or churns through acquisition dollars.

This guide covers what cohort analysis is, why it matters for ecommerce, the exact data you need, how to build a cohort table from scratch, how to read the output, benchmark curves by vertical, the five mistakes that break most cohort tables, and how to turn the analysis into action. Pair it with ecommerce retention metrics, new vs returning customer revenue, and increasing LTV without ads.

What is cohort analysis in ecommerce?

Definition

Cohort analysis is the practice of grouping customers by the period of their first purchase and tracking how many of them make subsequent purchases over time. Each row represents one acquisition cohort. Each column represents a period elapsed since acquisition. The intersection shows retention.

The power of the format is comparison. A simple repeat purchase rate of 35% tells you one thing. A cohort table showing that the January cohort retained 42% at Month 3 while the April cohort retained only 28% at Month 3 tells you something far more specific. Your product, onboarding, or customer experience degraded between January and April. The aggregate number hides that signal. The cohort table surfaces it.

Cohort analysis also solves the seasonality problem. A November cohort will always look worse than a March cohort in raw terms because holiday gift buyers do not behave like organic customers. Comparing November Month 1 to March Month 1 is unfair. Comparing November Month 1 to November Month 2 is precise. The cohort table keeps each cohort's timeline separate, so seasonality does not distort the signal.

Why cohort analysis matters for ecommerce operators

Ecommerce businesses live or die by unit economics. CAC, LTV, and payback period are the three numbers that determine whether a brand can scale profitably. All three depend on retention. A brand with a 12-month LTV of $180 and a CAC of $60 has a 3:1 ratio and a healthy business. If retention drops 20%, LTV falls to $144, the ratio drops to 2.4:1, and the brand is now burning cash on every new customer.

The problem is that aggregate retention metrics lag. By the time your blended repeat purchase rate has dropped, the damage was done two or three cohorts ago. Cohort analysis catches the drop in real time because you are comparing each new cohort against the previous one at the same elapsed time. The March cohort's Month 2 retention is directly comparable to the February cohort's Month 2 retention. If it is lower, you know within 60 days instead of waiting for the annual number.

Key insight

Aggregate metrics report the past. Cohort analysis predicts the future. A declining Month 3 retention curve today means lower LTV next quarter, which means higher CAC payback, which means less capital for growth. The operator who sees the curve shift in Month 2 has 90 days to fix the problem before it hits the P&L.

Three specific decisions become clearer with cohort analysis. First, marketing budget allocation. If the Meta cohort retains at 35% at Month 3 while the Google Ads cohort retains at 22%, you have a channel-quality signal that ROAS alone cannot provide. Second, product changes. If a packaging redesign shipped in March and the April cohort's Month 1 retention dropped 8 points, you have a causal link that A/B testing might miss. Third, customer success investment. If Month 2 is consistently the steepest drop across all cohorts, that is where an intervention — a follow-up email, a loyalty offer, a replenishment reminder — will have the highest return.

The data you need to build a cohort table

Cohort analysis requires less data than most operators assume. Three columns are sufficient to start.

  • Customer ID. A unique identifier for each customer. In Shopify this is the customer ID. In Stripe it is the customer object ID. The identifier must be stable across transactions. Email address works as a fallback but breaks when customers use different emails for different orders.
  • Transaction date. The date the order was placed, not shipped or delivered. Use the order creation timestamp from your e-commerce platform. Time zone consistency matters. If your store is in Eastern time and your warehouse system is in Pacific, pick one and stick to it.
  • Transaction amount. The order total in your reporting currency. Use the subtotal before shipping and tax if you are analyzing product retention. Use the full order total including shipping if you are analyzing revenue retention. Be consistent within each table.

You do not need advanced tools to start. Export a CSV from Shopify, Stripe, or your e-commerce platform with these three columns. A spreadsheet can produce the first cohort table in under 30 minutes. The goal is not perfect tooling. The goal is to start seeing the pattern.

Two optional fields add precision. First, acquisition channel. Tagging each customer with the UTM source of their first order lets you compare cohort quality by channel. Second, product SKU. Tracking which product a customer bought first lets you compare cohort quality by entry product. Both are valuable but not required for the first version.

How to build a cohort analysis table: step by step

Building a cohort table is a four-step process. Each step must be done in order. Skipping a step or doing them out of order is what produces the wrong answer.

Step 1: Assign every customer to a cohort

A customer's cohort is defined by their first transaction date, rounded to the period you are analyzing. For most ecommerce brands, monthly cohorts are the right granularity. Weekly cohorts create too much noise for low-volume businesses. Quarterly cohorts smooth away too much signal for high-volume businesses.

To assign cohorts, find the earliest transaction date for each customer ID. Round that date to the first day of the month. That month is the customer's cohort. Every customer belongs to exactly one cohort. A customer who first bought on January 15 is in the January cohort. A customer who first bought on January 3 is also in the January cohort. The exact day within the month does not matter for monthly cohorts.

Step 2: Calculate elapsed periods for every transaction

For every transaction in your dataset, calculate how many periods have elapsed since the customer's cohort date. A customer in the January cohort who buys again in January has an elapsed period of 0. A customer in the January cohort who buys again in February has an elapsed period of 1. A customer in the January cohort who buys again in March has an elapsed period of 2.

The formula is simple: elapsed periods equals the transaction month minus the cohort month. In spreadsheet terms, this is the number of months between the two dates. In SQL, it is a DATEDIFF function. In Python, it is a subtraction of period objects. The exact implementation depends on your tool, but the logic is identical.

Step 3: Count unique active customers per cohort per period

This is the step where most first-time builders make their first error. You must count unique customers, not transactions. If a customer from the January cohort buys three times in February, they count as one active customer in Period 1, not three. Counting transactions instead of customers overstates retention by 20-40% in most ecommerce datasets.

The output of this step is a table with three columns: cohort month, elapsed period, and unique active customers. For example:

Cohort monthElapsed periodUnique active customers
Jan 202501,000
Jan 20251420
Jan 20252280
Feb 202501,200
Feb 20251456
Feb 20252300

Step 4: Calculate retention percentages

The final step is normalization. Divide the unique active customers in each period by the original cohort size. The original cohort size is the count of unique customers in Period 0 for that cohort. This number becomes the denominator for every subsequent period in that row.

The formula for each cell is: retention percentage equals unique active customers in this period divided by original cohort size, multiplied by 100. The result is a percentage that allows direct comparison across cohorts of different sizes.

The final cohort table looks like this:

CohortSizeM0M1M2M3M6M12
Jan 20251,000100%42%28%22%15%8%
Feb 20251,200100%38%25%20%13%--
Mar 2025900100%45%30%24%----
Apr 20251,100100%40%27%------
May 20251,050100%44%--------

The table is triangular because earlier cohorts have had more time to accumulate data. The January cohort has 12 months of history. The May cohort has only one. This is correct. Do not force the table to be rectangular by excluding data from older cohorts.

How to read a cohort table

Reading a cohort table requires looking at three dimensions: horizontal trends within a cohort, vertical trends across cohorts, and diagonal trends that reveal seasonality.

Horizontal trends show how a single cohort ages. Read left to right across any row. The January cohort starts at 100% in Month 0, drops to 42% in Month 1, 28% in Month 2, and continues declining. The shape of this curve is your retention signature. A steep drop in Month 1 followed by a flattening curve is normal for ecommerce. A steep drop in Month 3 that was not there in older cohorts is a warning signal.

Vertical trends show whether newer cohorts are better or worse than older ones. Read down any column. Compare the Month 1 column: January is 42%, February is 38%, March is 45%, April is 40%, May is 44%. The February cohort underperformed. The March cohort overperformed. The May cohort is back on track. This vertical comparison is the fastest way to spot whether your retention is improving or degrading.

Diagonal trends reveal seasonality. Compare a cell to the cell one row down and one column right. These represent the same calendar month for different cohorts. If November Month 1 is consistently lower than March Month 1 across multiple years, you have a seasonal effect. Holiday gift buyers behave differently than organic shoppers. The cohort table makes this visible without requiring a separate seasonal adjustment.

Key insight

The most important question a cohort table answers is not "what is our retention rate?" It is "is our retention rate getting better or worse?" The absolute number matters for benchmarking. The trend matters for survival.

Ecommerce cohort retention benchmarks by vertical

Retention curves vary dramatically by product category. A supplement brand with auto-ship subscriptions will retain customers at rates a furniture brand cannot match. The table below shows typical 12-month cohort retention by vertical, drawn from composite D2C operator reports and platform benchmark studies.

VerticalM1 retentionM3 retentionM6 retentionM12 retentionPrimary driver
Subscription ecommerce75-85%55-70%50-60%45-65%Auto-ship commitment
Food and beverage40-55%30-40%22-30%30-45%Replenishment need
Health and supplements40-55%28-38%20-28%30-45%Habit formation
Beauty and personal care35-50%24-34%16-24%25-40%Product discovery
Fashion and apparel25-40%18-26%12-18%18-30%Style turnover
Home and furniture15-25%10-16%7-12%8-15%Purchase cycle length
Pet products35-50%26-36%18-26%28-40%Replenishment + loyalty

Use these ranges as directional guides, not targets. A beauty brand at 45% M12 retention is either in a premium niche with genuine loyalty, or its cohort definition is too narrow. A furniture brand at 5% M12 retention is either selling one-and-done items, or its post-purchase experience is failing to create repeat demand.

Cited statistic

The average D2C ecommerce retention rate across all verticals is 31% at 12 months, according to 2026 composite data from Propel AI and industry benchmark studies. Transactional ecommerce brands average 25-40% annual retention, while subscription models reach 45-65%. The gap between subscription and transactional is not a matter of marketing quality. It is a matter of business model mechanics.

The five mistakes that break cohort tables

These are the errors we see most often when operators build their first cohort tables. Each one produces a table that looks correct but tells the wrong story.

Mistake 1: Counting transactions instead of unique customers

This is the most common error. A customer who buys three times in February counts as three transactions but only one retained customer. Counting transactions overstates retention by 20-40% in most datasets. The fix is simple: use a COUNT DISTINCT on customer ID, not a COUNT on transaction ID.

Mistake 2: Using calendar months instead of rolling periods

A customer who buys on January 31 and again on February 1 has an elapsed period of 1 month in a calendar-month system, even though only one day has passed. In a rolling 30-day system, that same customer has an elapsed period of 0. The calendar-month approach creates artificial spikes and drops at month boundaries. Use rolling periods for accuracy.

Mistake 3: Including customers with no transaction history

Some platforms create customer records for email subscribers or cart abandoners who never completed a purchase. Including these in your cohort table inflates the denominator and depresses retention percentages. Filter to customers with at least one completed transaction before assigning cohorts.

Mistake 4: Comparing cohorts of different sizes without normalizing

A cohort of 100 customers with 30 retained in Month 1 has the same retention rate as a cohort of 10,000 customers with 3,000 retained. The absolute numbers are different but the percentage is identical. Always normalize to percentages before comparing cohorts. Raw counts are useful for planning but misleading for analysis.

Mistake 5: Ignoring seasonality in interpretation

November and December cohorts will almost always show lower retention than March and April cohorts. This is not a product failure. It is a gift-buyer effect. Comparing a December cohort's Month 1 to a March cohort's Month 1 is comparing two different customer types. Compare December Month 1 to December Month 2, or compare December to the previous December, not to March.

From cohort analysis to action

A cohort table is not a report. It is a diagnostic tool. The operator who stops at building the table has done half the work. The operator who turns the table into action has done the work that matters.

Three actions flow directly from cohort analysis. First, channel reallocation. If your Meta cohorts retain at 35% at Month 3 while your TikTok cohorts retain at 18%, you have evidence that Meta customers are higher quality. This does not mean you should abandon TikTok. It means you should calculate the fully loaded CAC and LTV for each channel and reallocate budget toward the channel with the higher LTV:CAC ratio, not the lower upfront CAC.

Second, intervention timing. If Month 2 is consistently the steepest drop across all cohorts, that is when a retention campaign will have the highest impact. A replenishment reminder, a loyalty discount, or a product education email sent at Day 45 can flatten the curve. Test one intervention per cohort and measure the lift.

Third, product decisions. If customers who bought Product A as their first purchase retain at 45% at Month 3 while customers who bought Product B retain at 22%, you have a product-market fit signal. Product A is a better entry point. Consider featuring Product A more prominently in ads, on landing pages, and in email flows. Product B may still be valuable as an upsell, but it is not the right first impression.

How Fairview handles cohort analysis

Fairview connects to Shopify, Stripe, and your other data sources and builds cohort tables automatically. It assigns each customer to a monthly cohort based on their first transaction date, tracks retention by elapsed period, and normalizes to percentages without manual calculation.

The Cohort LTV Tracker goes further. It layers revenue data on top of the retention table, so you see not just what percentage of customers returned but how much revenue they generated in each period. This produces a revenue retention curve that often differs from customer retention. If your retained customers spend more over time, revenue retention will exceed customer retention. If they spend less, the reverse is true.

Fairview also surfaces the comparison that matters most: whether newer cohorts are retaining better or worse than older cohorts at the same elapsed time. When a newer cohort underperforms, Fairview flags it as an anomaly and recommends the specific action to investigate. The Operating Dashboard surfaces cohort trends alongside margin, pipeline, and forecast data, so the operator sees the full picture in one view.

Key takeaways

  • Cohort analysis groups customers by first purchase month and tracks retention over time. It is the only format that reveals whether retention is improving or degrading.
  • You need three data points: customer ID, transaction date, and transaction amount. A spreadsheet with these columns is sufficient for the first version.
  • Build the table in four steps: assign cohorts, calculate elapsed periods, count unique active customers, and normalize to percentages.
  • Read the table horizontally for cohort aging, vertically for cohort comparison, and diagonally for seasonality.
  • Turn the analysis into action by reallocating budget toward higher-retention channels, timing interventions at the steepest drop, and using entry-product retention as a product-market fit signal.

If you are looking to build cohort analysis into your weekly operating review, book a demo to see how Fairview automates cohort tables, tracks LTV by acquisition channel, and surfaces retention anomalies before they hit your P&L.

+

Building a cohort analysis table requires four steps. First, assign every customer to a cohort based on their first purchase date, typically rounded to the month. Second, for every subsequent order, calculate how many periods have elapsed since the customer's cohort date. Third, count the unique customers from each cohort who made a purchase in each elapsed period. Fourth, divide each period's count by the original cohort size to get retention percentages. The result is a triangular matrix where earlier cohorts have more columns filled in than recent ones.

+

Good ecommerce cohort retention varies by vertical. For food and beverage, 30-45% of customers return within 12 months. For beauty and personal care, 25-40% return. For fashion and apparel, 18-30% return. For home and furniture, 8-15% return. Subscription ecommerce performs better at 45-65% annual retention. The key benchmark is not the absolute number but the trend: newer cohorts should retain at equal or better rates than older cohorts at the same elapsed time. If Month 3 retention for the March cohort is lower than Month 3 retention for the January cohort, something has degraded.

+

You need three data points at minimum: a unique customer identifier, the date of each transaction, and the transaction amount. From these you derive the cohort date for each customer, which is their earliest transaction date. You also need to decide what counts as activity: a purchase, a login, or a revenue threshold. For ecommerce, a purchase is the standard activity metric. You do not need advanced tools to start. A spreadsheet with customer ID, order date, and order value is sufficient for the first version.

+

The five most common mistakes are: one, counting transactions instead of unique customers, which overstates retention when one customer buys twice in a period. Two, using calendar months instead of rolling periods, which creates artificial spikes around month-end. Three, including customers with no transaction history, which inflates the denominator. Four, comparing cohorts of different sizes without normalizing to percentages. Five, ignoring seasonality, which makes November and December cohorts look artificially weak because gift buyers do not return at normal rates.

Siddharth Gangal is Founder at Fairview. He has spent 12 years building operating systems for revenue teams and advises D2C operators on metrics, forecasting, and margin management.

Fairview · Free for 14 days

Turn this into action — automatically.

Connect your CRM, finance, and ad data. Fairview surfaces margin leaks, pipeline risk, and next-best actions every week.

No credit card · Setup in under 10 minutes

Frequently asked questions

+

Cohort analysis in ecommerce is the practice of grouping customers by the period of their first purchase and tracking how many of them make subsequent purchases over time. Each row of a cohort table represents one acquisition cohort. Each column represents a period elapsed since acquisition. The cells show either the count or percentage of customers from that cohort who purchased again in that period. This format reveals whether newer cohorts retain better than older ones, which is the signal that tells you if your product, marketing, or customer experience is improving or degrading.

Stop reading. Start making decisions.

Connect your stack, see your operating picture, act on what matters. First source live in 10 minutes.