TL;DR
- A complete lead scoring model runs on four dimensions: demographic fit, firmographic fit, behavioral engagement, and negative signals. Each carries explicit point values, and the combined score is read on a 100-point scale.
- The MQL threshold belongs at the score that captures the top 15–20 percent of your database and produces an MQL-to-SQL acceptance rate above 60 percent. For most B2B teams this falls between 60 and 75 points.
- Behavioral scores must time-decay: halve them after 30 days of inactivity, and again at 60 days. Without decay, stale leads accumulate and sales trust erodes.
- Rule-based scoring outperforms poorly trained predictive models. Predictive scoring requires at least 1,000 lead records, 200 closed-won deals, and 200 closed-lost deals before it produces reliable signal.
- The model needs a quarterly calibration loop tied to rejected MQL reason codes from sales — not just marketing's intuition about what a good lead looks like.
Most lead scoring models fail for the same reason: they were built once by marketing, handed to sales without a feedback mechanism, and never touched again. Score inflation sets in. The MQL queue fills with leads who opened one email eight months ago. Sales starts ignoring the queue entirely. The model becomes a reporting artifact rather than an operating mechanism.
This post is a working template — attribute tables, weights, point values, threshold logic, and a validation framework — not a conceptual overview. If you already understand what lead scoring is and want the mechanics, start at the template section. If you want context on model type selection first, the next section covers rule-based versus predictive scoring and when each applies.
Rule-Based vs. Predictive Lead Scoring: Which Model Type to Build
The choice between rule-based and predictive scoring is primarily a data availability question, not a sophistication question. Predictive scoring is not inherently better — a well-calibrated rule-based model consistently outperforms a poorly trained predictive model, and a poorly trained predictive model generates confident-sounding garbage.
Rule-based scoring
Rule-based scoring assigns fixed point values to attributes and behaviors you define explicitly. A VP title is worth 20 points. A demo request is worth 40 points. A visit to the pricing page is worth 15 points. The model is fully transparent — sales can see exactly why a lead has a given score — and it is deployable in two to four weeks without historical data requirements.
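As a rough illustration of that transparency, here is a minimal Python sketch of a rule table built from the three example values above. The rule names and the `score_lead` helper are hypothetical, and counting each rule once per lead is an assumption rather than something the model prescribes.

```python
# Hypothetical rule table using the three example values from the text.
RULES = {
    "vp_title": 20,
    "demo_request": 40,
    "pricing_page_visit": 15,
}


def score_lead(signals):
    """Return the total score and a per-rule breakdown for one lead."""
    # Assumption: each rule counts once, however many times the signal fired.
    breakdown = {name: RULES[name] for name in set(signals) if name in RULES}
    return sum(breakdown.values()), breakdown


total, breakdown = score_lead(["vp_title", "pricing_page_visit"])
print(total, breakdown)
```

The breakdown dictionary is what makes the score explainable: a rep can see which rules fired and how many points each contributed.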
The limitation is that rule-based scoring cannot discover patterns you did not think to look for. If visits to your integration documentation page are a strong conversion predictor, you will only capture that signal if someone thought to include it. Rule-based models also require manual updates as your ICP and product evolve.
Predictive scoring
Predictive scoring trains a machine learning model on your historical closed-won and closed-lost deal data to surface patterns that correlation analysis would miss — which firmographic combinations convert at 3x the base rate, which behavioral sequences reliably precede purchase, which signals look like engagement but have no conversion relationship.
The data requirements are non-negotiable: at minimum 1,000 lead records, 200 closed-won deals, and 200 closed-lost deals with consistent field population in your CRM. Below that threshold, the model overfits to noise. Initial setup takes four to eight weeks, plus 60 to 90 days of live operation before the predictions are reliable enough to drive routing decisions. HubSpot's predictive scoring and Marketo's AI scoring tier both require comparable data volumes before they produce calibrated output.
Which to build now
Start with rule-based scoring if: you have fewer than 500 closed deals in your CRM, your team has no prior scoring model (the feedback loop does not exist yet), or you are entering a new segment where historical patterns may not transfer. Build the rule-based model, run it for two to three quarters, collect rejection reason codes from sales, and use that data to either refine the rules or seed a predictive model with cleaner training data. Marketo's standard guidance is to run any new model in parallel with the existing one for at least one full sales cycle before switching — the same principle applies to the rule-to-predictive transition.
The Four Dimensions of a Complete Scoring Model
Every effective B2B lead scoring model answers two questions simultaneously. First: can this person realistically buy? That is the fit question — answered by demographic and firmographic attributes. Second: are they interested right now? That is the intent question — answered by behavioral engagement signals. Negative scoring is a third layer that removes false positives both systems generate.
Dimension 1: Demographic fit (person-level)
Demographic scoring evaluates whether the individual contact has the authority, role, and context to champion or approve a purchase. This dimension covers job title, seniority, department, and role function. A Chief Revenue Officer in a target industry is a materially different lead than a marketing coordinator at the same company — regardless of what content they have consumed.
Dimension 2: Firmographic fit (company-level)
Firmographic scoring evaluates whether the company matches your ICP on the dimensions that correlate with deal closure: employee count, revenue range, industry vertical, technology stack, and geography. A company that is outside your target employee headcount band — regardless of how engaged the individual contact is — will likely stall at procurement or produce undersized deals that churn quickly.
Dimension 3: Behavioral engagement (intent signals)
Behavioral scoring tracks what the lead has done across your owned channels: web pages visited, content downloaded, emails opened and clicked, webinars attended, free trial activity, demo requests, and pricing page visits. Not all behaviors carry equal weight. Pricing page visits and demo requests are high-intent signals. Blog post views and email opens are low-intent signals. The scoring model must reflect that hierarchy explicitly, or engagement volume will dominate intent quality.
Dimension 4: Negative signals (disqualification and decay)
Negative scoring deducts points for attributes and behaviors that indicate a lead is not a real buyer or has gone cold. This dimension is the most commonly skipped, and skipping it is the fastest way to degrade model quality. Without negative scoring, score inflation is inevitable: a lead who engaged briefly nine months ago keeps those points indefinitely and can surface as an MQL long after any genuine intent window has closed.
Lead Scoring Model Template: Full Attribute and Weight Tables
The tables below constitute a working 100-point scoring model. The point allocations reflect standard B2B SaaS weighting for a sales-led motion with ACV between $15K and $100K. For PLG or high-velocity SMB motions, shift more weight toward behavioral and product usage signals. For enterprise deals above $100K ACV, increase firmographic and demographic weight and add a separate MEDDIC qualification layer before SQL handoff.
Demographic and firmographic attributes are capped at 25 points each, for a maximum of 50 points of fit. Behavioral engagement is capped at 65 points. Where the rows in a table add up to more than the stated maximum, treat that maximum as a cap. Negative signals carry uncapped deductions. The ceiling before deductions is therefore 115 points, but few realistic leads score above 100, and negative adjustments pull the distribution back onto the 100-point scale. Adjust point values to match your ICP: the architecture and relative weighting are what matter, not the absolute numbers.
Table 1: Demographic Fit (max 25 points)
| Attribute | Value / Condition | Points |
|---|---|---|
| Job Title / Seniority | C-suite, VP, or Owner (direct buyer or champion) | +20 |
| Job Title / Seniority | Director or Senior Manager (influencer or champion) | +15 |
| Job Title / Seniority | Manager or Lead (evaluator, limited authority) | +8 |
| Job Title / Seniority | Individual Contributor (end user, no budget authority) | +3 |
| Department | RevOps, Sales, Marketing, Finance, Operations | +5 |
| Department | Engineering, Product, HR (adjacent, not primary buyer) | +2 |
| Email Domain Type | Corporate domain (company email address) | +0 |
| Email Domain Type | Personal domain: Gmail, Yahoo, Hotmail, etc. | −20 |
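
Table 1 could be encoded along these lines. This is a sketch, not a CRM implementation: the seniority keys, the short personal-domain list, and the `demographic_score` function are all hypothetical names chosen for illustration. The personal-domain deduction also appears in Table 4, so a real model would apply it in only one place.

```python
SENIORITY_POINTS = {
    "c_suite_vp_owner": 20,
    "director_senior_manager": 15,
    "manager_lead": 8,
    "individual_contributor": 3,
}
PRIMARY_DEPARTMENTS = {"revops", "sales", "marketing", "finance", "operations"}
ADJACENT_DEPARTMENTS = {"engineering", "product", "hr"}
# Illustrative list only; a real deployment would maintain a longer one.
PERSONAL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com"}
DEMOGRAPHIC_MAX = 25


def demographic_score(seniority, department, email):
    """Score one contact against Table 1 (person-level fit)."""
    points = SENIORITY_POINTS.get(seniority, 0)
    dept = department.strip().lower()
    if dept in PRIMARY_DEPARTMENTS:
        points += 5
    elif dept in ADJACENT_DEPARTMENTS:
        points += 2
    # Personal email domains carry a 20-point deduction.
    domain = email.rsplit("@", 1)[-1].lower()
    if domain in PERSONAL_DOMAINS:
        points -= 20
    return min(points, DEMOGRAPHIC_MAX)
```

For example, `demographic_score("c_suite_vp_owner", "Sales", "cro@example.com")` reaches the 25-point maximum, while a manager in HR with a Gmail address ends up below zero.
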
Table 2: Firmographic Fit (max 25 points)
| Attribute | Value / Condition | Points |
|---|---|---|
| Company Size (employees) | 50–500 (core ICP band) | +15 |
| Company Size (employees) | 501–2,000 (adjacent, slightly larger) | +10 |
| Company Size (employees) | 20–49 (smaller than ideal, but possible) | +5 |
| Company Size (employees) | Under 20 or above 2,000 | +0 |
| Industry Vertical | Primary target verticals (e.g., B2B SaaS, professional services) | +10 |
| Industry Vertical | Adjacent verticals with documented wins | +5 |
| Industry Vertical | Excluded verticals (government, nonprofit, education) | −25 |
| Technology Stack Signal | Known tech stack match (Salesforce, HubSpot, or target complementary tools) | +5 |
| Geography | Primary served regions (e.g., US, Canada, UK, Australia) | +3 |
| Competitor Domain | Email domain matches known direct competitor | −50 |
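
A comparable sketch for Table 2 follows. The positive rows add up to 33 points, so the sketch assumes the stated 25-point maximum acts as a cap on positive fit while the deductions stay uncapped. The function names and the `vertical` labels (`"primary"`, `"adjacent"`, `"excluded"`) are hypothetical.

```python
FIRMOGRAPHIC_MAX = 25


def company_size_points(employees):
    """Map employee count to the size bands in Table 2."""
    if 50 <= employees <= 500:
        return 15
    if 501 <= employees <= 2000:
        return 10
    if 20 <= employees <= 49:
        return 5
    return 0  # under 20 or above 2,000


def firmographic_score(employees, vertical, tech_match, in_region,
                       competitor_domain=False):
    """Score one company: capped positive fit plus uncapped deductions."""
    positive = company_size_points(employees)
    positive += {"primary": 10, "adjacent": 5}.get(vertical, 0)
    positive += 5 if tech_match else 0
    positive += 3 if in_region else 0
    positive = min(positive, FIRMOGRAPHIC_MAX)  # assumption: max is a cap

    deductions = 0
    if vertical == "excluded":
        deductions -= 25
    if competitor_domain:
        deductions -= 50
    return positive + deductions
```

A 200-person company in a primary vertical with a tech stack match in a served region would total 33 raw points and is capped at 25 here.
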
Table 3: Behavioral Engagement (max 65 points, subject to time decay)
| Behavior | Condition | Points | Intent Level |
|---|---|---|---|
| Demo / Trial Request | Submitted demo request or signed up for free trial | +40 | High |
| Pricing Page Visit | Visited /pricing in the last 14 days | +20 | High |
| Pricing Page Visit | Visited /pricing, 15–60 days ago | +10 | High |
| Contact / Sales Page Visit | Visited /contact or /talk-to-sales in last 30 days | +15 | High |
| High-Value Content Download | Downloaded ROI calculator, buyer guide, or comparison report | +15 | Medium |
| Webinar Attendance | Attended live or watched on-demand in last 30 days | +12 | Medium |
| Product / Features Page Visit | Visited 3 or more product or feature pages in a single session | +10 | Medium |
| Case Study / ROI Content | Viewed 2 or more customer stories or ROI pages | +8 | Medium |
| Email Click-Through | Clicked a link in a marketing email (not unsubscribe) | +5 | Low |
| Blog / Top-of-Funnel Visit | Visited blog posts or resource pages only | +2 | Low |
| Email Open (only) | Opened email but did not click; no site visit | +1 | Low |
| Email Unsubscribe | Unsubscribed from any marketing email | −30 | Disqualify |
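
The behavioral table might be sketched as follows, before any time decay is applied. Several choices here are assumptions: the event names are hypothetical, each behavior is counted once per lead so that volume cannot swamp intent, events are assumed to be pre-filtered to the windows named in the table (for example, contact page visits in the last 30 days), and the unsubscribe deduction is applied outside the 65-point cap.

```python
EVENT_POINTS = {
    "demo_or_trial_request": 40,
    "contact_page_visit": 15,
    "high_value_download": 15,
    "webinar_attendance": 12,
    "product_pages_3plus": 10,
    "case_studies_2plus": 8,
    "email_click": 5,
    "blog_visit": 2,
    "email_open_only": 1,
}
BEHAVIORAL_MAX = 65
UNSUBSCRIBE_DEDUCTION = -30


def behavioral_score(events, days_since_pricing_visit=None):
    """Score engagement events for one lead, before time decay."""
    # Assumption: each behavior counts once, regardless of repetition.
    positive = sum(EVENT_POINTS.get(name, 0) for name in set(events))
    # Pricing page visits are worth more when recent.
    if days_since_pricing_visit is not None:
        if days_since_pricing_visit <= 14:
            positive += 20
        elif days_since_pricing_visit <= 60:
            positive += 10
    positive = min(positive, BEHAVIORAL_MAX)
    deduction = UNSUBSCRIBE_DEDUCTION if "email_unsubscribe" in events else 0
    return positive + deduction
```

Under these assumptions, fifty email opens still score 1 point, while a demo request plus a pricing visit three days ago scores 60.
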
Table 4: Negative Signals and Disqualifiers
| Signal | Condition | Points |
|---|---|---|
| Personal Email Domain | Gmail, Yahoo, Hotmail, or other free-tier provider | −20 |
| Competitor Domain | Email domain matches known direct competitor | −50 |
| Non-Buyer Job Title | Student, intern, consultant at competitor, researcher | −30 |
| Excluded Industry | Government, nonprofit, K-12 education, or other excluded verticals | −25 |
| Extended Inactivity | No behavioral signal of any kind in 91+ days | −15 |
| Bounced Email | Hard bounce on any email address | −25 |
| Previously Disqualified | Sales rejected this lead with "not a fit" reason code in prior cycle | −40 |
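
Because several of these deductions also appear in Tables 1 and 2, one way to avoid double-counting is to collect disqualifier flags in one place and apply each at most once. The sketch below assumes hypothetical flag names and a hypothetical `negative_adjustment` helper.

```python
NEGATIVE_SIGNALS = {
    "personal_email_domain": -20,
    "competitor_domain": -50,
    "non_buyer_job_title": -30,
    "excluded_industry": -25,
    "extended_inactivity": -15,
    "bounced_email": -25,
    "previously_disqualified": -40,
}


def negative_adjustment(flags):
    """Sum the deductions for a lead, counting each signal at most once."""
    unique = set(flags)
    unknown = unique.difference(NEGATIVE_SIGNALS)
    if unknown:
        raise ValueError(f"Unknown negative signals: {sorted(unknown)}")
    return sum(NEGATIVE_SIGNALS[name] for name in unique)
```

Rejecting unknown flags instead of silently ignoring them is a design choice: a typo in a flag name would otherwise let a disqualified lead keep its full score.
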
Time Decay: How to Prevent Score Inflation
Time decay is the mechanism that prevents your lead queue from filling with stale engagement. Without it, a lead who downloaded your buyer guide nine months ago and has been dormant since will retain those 15 behavioral points indefinitely, eventually crossing the MQL threshold on the strength of old engagement alone.
Apply decay only to behavioral scores, not to demographic or firmographic attributes. A lead's job title and company size do not expire. Their intent signals do.
Standard decay schedule
| Time Since Last Behavioral Signal | Score Multiplier Applied to Behavioral Points |
|---|---|
| 0–14 days | 100% (full value) |
| 15–30 days | 75% |
| 31–60 days | 50% |
| 61–90 days | 25% |
| 91+ days | 0% (behavioral score reset to 0; apply −15 inactivity deduction) |
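
The schedule above translates directly into a lookup. The following is a minimal sketch with hypothetical function names; it returns the decayed behavioral points and the inactivity deduction separately so that the two stay auditable.

```python
INACTIVITY_DEDUCTION = -15


def decay_multiplier(days_since_last_signal):
    """Return the behavioral score multiplier from the decay schedule."""
    if days_since_last_signal <= 14:
        return 1.0
    if days_since_last_signal <= 30:
        return 0.75
    if days_since_last_signal <= 60:
        return 0.5
    if days_since_last_signal <= 90:
        return 0.25
    return 0.0  # 91+ days: behavioral score resets to zero


def apply_decay(behavioral_points, days_since_last_signal):
    """Return (decayed behavioral points, inactivity deduction)."""
    multiplier = decay_multiplier(days_since_last_signal)
    deduction = INACTIVITY_DEDUCTION if multiplier == 0.0 else 0
    return behavioral_points * multiplier, deduction
```

For a lead with 40 behavioral points, `apply_decay(40, 45)` gives 20 points and no deduction, while `apply_decay(40, 120)` gives 0 points plus the 15-point inactivity deduction.
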
Implementing decay in HubSpot or Marketo requires either a scheduled workflow that recalculates scores on a defined cadence, or a timestamp-based scoring property that stores the date of each behavioral event and computes the decayed value dynamically. Most teams use the scheduled workflow approach because it is simpler to audit. Run decay calculations weekly at minimum — daily for high-volume pipelines.
A practical signal that decay is working: the distribution of lead scores should shift meaningfully within 60 to 90 days of implementation as old engagement ages out. If the distribution does not shift, the decay logic is not firing correctly.
Setting the MQL Threshold: Methodology, Not Intuition
The MQL threshold is the score at which a lead is automatically routed to sales for follow-up. Set it too low and sales receives a high volume of weak leads, loses trust in the scoring model, and stops working the queue. Set it too high and real pipeline sits unworked until the intent window closes.
The threshold-setting methodology has three steps.
Step 1: Establish your baseline conversion rate
Pull all leads created in the past 12 months. Calculate what percentage of all leads converted to a closed-won opportunity, regardless of their score. This is your base rate. For most mid-market B2B teams this is 1 to 5 percent. Now segment that same population by score decile and calculate the conversion rate within each decile. If your model is working, conversion rate should rise monotonically as score increases. If it does not, your scoring attributes are not actually predictive of conversion and the weights need rework before you set a threshold.
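The decile check can be sketched in plain Python, assuming the leads have been exported as `(score, won)` pairs where `won` is true for closed-won. The function names are hypothetical, and the monotonic test here accepts flat stretches (non-decreasing), which is a lenient reading of "rise monotonically".

```python
def base_rate(leads):
    """Share of all leads that converted to closed-won."""
    return sum(won for _, won in leads) / len(leads)


def conversion_by_decile(leads):
    """Return ten conversion rates, lowest-scoring decile first."""
    ranked = sorted(leads, key=lambda lead: lead[0])
    n = len(ranked)
    rates = []
    for d in range(10):
        bucket = ranked[d * n // 10:(d + 1) * n // 10]
        wins = sum(won for _, won in bucket)
        rates.append(wins / len(bucket) if bucket else 0.0)
    return rates


def is_non_decreasing(rates):
    """True if conversion never drops as score decile increases."""
    return all(a <= b for a, b in zip(rates, rates[1:]))
```

If `is_non_decreasing(conversion_by_decile(leads))` comes back false on a reasonably large sample, that is the signal to rework weights before setting a threshold.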
Step 2: Find the conversion inflection point
Identify the score decile where conversion rate first meaningfully exceeds the base rate — typically 2x or more. That inflection point is the candidate MQL threshold. For most B2B models on a 100-point scale, this falls between 60 and 75 points and captures the top 15 to 20 percent of leads by volume. If fewer than 10 percent of leads reach that threshold, it is too high. If more than 30 percent reach it, it is too low and sales will be overwhelmed with low-quality volume.
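One way to locate the candidate threshold is sketched below, again assuming `(score, won)` pairs. The 2x lift and the 10 to 30 percent volume bounds come from the paragraph above; the function names, and the choice to use the lowest score in the qualifying decile as the threshold, are assumptions.

```python
def candidate_threshold(leads, lift=2.0):
    """Lowest score in the first decile converting at >= lift x base rate."""
    base = sum(won for _, won in leads) / len(leads)
    if base == 0:
        return None  # no conversions, so no inflection point to find
    ranked = sorted(leads, key=lambda lead: lead[0])
    n = len(ranked)
    for d in range(10):
        bucket = ranked[d * n // 10:(d + 1) * n // 10]
        if not bucket:
            continue
        rate = sum(won for _, won in bucket) / len(bucket)
        if rate >= lift * base:
            return bucket[0][0]
    return None


def volume_check(leads, threshold):
    """Classify the threshold by the share of leads that reach it."""
    share = sum(1 for score, _ in leads if score >= threshold) / len(leads)
    if share < 0.10:
        return "too high"
    if share > 0.30:
        return "too low"
    return "ok"
```

Running `volume_check` on the candidate before taking it to sales catches the case where the inflection point exists but would route too many or too few leads.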
Step 3: Validate with sales before go-live
Before activating automated MQL routing, share a sample of 20 to 30 leads that would qualify under the proposed threshold with two or three sales reps. Ask them to rate each lead as "would call immediately," "would call this week," or "would not prioritize." If fewer than 60 percent of the sample falls into the first two categories, either the threshold needs to rise or specific attributes need re-weighting. This step surfaces the definition gap between marketing's model and sales' judgment before it shows up as rejection rates in production.
After go-live, the target acceptance rate — the percentage of MQLs that sales accepts within 48 hours — is 60 percent or above. Below that threshold, the model is generating false positives. Above 90 percent, the threshold is too conservative and likely leaving real pipeline in the queue.
Validating and Calibrating the Model Over Time
A scoring model that is not actively maintained decays faster than the leads it is supposed to evaluate. The validation framework below runs on a quarterly cadence and uses three data inputs: conversion metrics, rejected MQL reason codes, and cohort analysis.
The three health metrics
| Metric | Healthy Range | Action if Out of Range |
|---|---|---|
| MQL-to-SQL Acceptance Rate | 60–90% | Below 60%: raise threshold or re-weight false-positive attributes. Above 90%: lower threshold — you are missing real pipeline. |
| SQL-to-Opportunity Conversion Rate | > 30% | Below 30%: scoring is passing leads that fail during discovery. Audit the fit criteria and add disqualifiers for segments with low discovery-to-opportunity conversion. |
| Time MQL → First Sales Contact | < 4 hours | Above 4 hours: a routing or SLA problem, not a scoring problem. Address the handoff process, not the model. |
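
The table can be turned into a simple quarterly check. This sketch uses a hypothetical `health_actions` function, takes rates as fractions between 0 and 1, and treats values exactly on a boundary (60%, 90%, 30%, 4 hours) as in range, which the table does not specify.

```python
def health_actions(acceptance_rate, sql_to_opp_rate, hours_to_first_contact):
    """Return the follow-up actions implied by the three health metrics."""
    actions = []
    if acceptance_rate < 0.60:
        actions.append("raise threshold or re-weight false-positive attributes")
    elif acceptance_rate > 0.90:
        actions.append("lower threshold")
    if sql_to_opp_rate < 0.30:
        actions.append("audit fit criteria and add disqualifiers")
    if hours_to_first_contact > 4:
        actions.append("fix routing or SLA, not the model")
    return actions
```

An empty list means all three metrics are in range for the quarter.
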
Rejected MQL reason codes
Every MQL rejection must carry a reason code. Without structured rejection feedback, you cannot distinguish between a threshold calibration problem, a weight problem, and a data quality problem. Require sales to select from a defined list when rejecting an MQL: wrong company size, wrong industry, wrong title, no budget authority, already a customer, already in an open opportunity, or no recent engagement. Analyze rejection codes quarterly. If "wrong company size" appears in more than 20 percent of rejections, your firmographic scoring is either miscalibrated or your ICP definition has shifted.
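The quarterly rejection analysis can be sketched with the standard library's `collections.Counter`. The source gives the 20 percent trigger only for "wrong company size"; applying the same limit to every reason code, as this hypothetical `flag_rejection_codes` helper does, is an assumption.

```python
from collections import Counter


def flag_rejection_codes(codes, limit=0.20):
    """Return reason codes whose share of rejections exceeds the limit."""
    total = len(codes)
    if total == 0:
        return {}
    counts = Counter(codes)
    return {
        code: count / total
        for code, count in counts.items()
        if count / total > limit
    }


quarter = (
    ["wrong company size"] * 3
    + ["wrong title"] * 1
    + ["no budget authority"] * 6
)
flagged = flag_rejection_codes(quarter)
print(sorted(flagged))
```

In this made-up sample, "wrong company size" and "no budget authority" are flagged, pointing at firmographic and seniority weights respectively.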
Cohort conversion analysis
Each quarter, run a cohort analysis comparing the closed-won rate of leads who crossed the MQL threshold against leads who scored between 10 and 15 points below the threshold. If the conversion rate difference is not statistically significant, either the threshold is in the wrong place or the model's predictive power is weak. Conversely, if leads just below the threshold still convert at a high multiple of the base rate (for example 3x, against 5x for leads above it), the threshold may be set too conservatively and you are leaving high-value leads in a non-priority queue.
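The text does not name a significance test. One common choice for comparing two conversion rates is a two-proportion z-test, sketched here with only the standard library. The function name and the sample numbers are hypothetical, and the normal approximation is only reasonable when both cohorts contain a fair number of wins and losses.

```python
import math


def two_proportion_z_test(wins_a, n_a, wins_b, n_b):
    """Return (z, two-sided p-value) for the difference in win rates."""
    p_a = wins_a / n_a
    p_b = wins_b / n_b
    pooled = (wins_a + wins_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0, 1.0  # both cohorts are all wins or all losses
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical cohorts: above-threshold leads vs. the band just below.
z, p_value = two_proportion_z_test(wins_a=60, n_a=400, wins_b=20, n_b=400)
print(round(z, 2), p_value < 0.05)
```

A p-value below a conventional cutoff such as 0.05 suggests the threshold separates the cohorts; a larger one is the cue to revisit the threshold or the weights.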
Marketo's standard guidance applies here: run any model changes — threshold adjustments, weight changes, new attributes — in parallel with the current model for one full sales cycle before fully switching. A full sales cycle means from lead creation through closed-won, not just through MQL handoff. For a typical B2B mid-market motion, that is 60 to 90 days.
Fit vs. Intent: Adjusting Weight Distribution for Your Sales Motion
The template above allocates 50 points to fit (demographic + firmographic) and 65 points to behavioral intent before negative scoring, weighting intent slightly higher. This reflects a typical B2B SaaS inbound motion where behavioral signals are abundant and reliably captured through marketing automation. Adjust the balance based on your motion type.
Sales-led outbound at high ACV (> $50K): Increase fit weight to 60 to 70 percent of total positive points. Your sales team generates their own intent signals through outbound sequences; scoring should primarily validate that the company and contact are worth the outbound investment. Consider adding a separate intent data tier using third-party signals from tools like Bombora or G2 if first-party behavioral signals are sparse.
PLG or product-led motion: Shift 60 to 70 percent of weight to behavioral and product usage signals. Free trial activation rate, feature adoption milestones, session frequency, and team seat expansion are the most predictive signals in a PLG model. Fit matters less because product usage itself is evidence of a real buyer who has already invested time in your product.
High-velocity SMB (< $10K ACV): Compress the model. A 50-point scale is sufficient. The sales cycle is short enough that intent signals are the dominant variable — whether the lead has requested a demo or visited pricing in the last seven days matters far more than their exact company size. Simplify to keep the model maintainable without a dedicated ops resource.
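The rebalancing arithmetic for these motions can be sketched as a split of the total positive points between fit and behavioral caps. The `rebalance_caps` helper is hypothetical, and the specific shares passed in below are illustrative picks from within the ranges quoted above.

```python
def rebalance_caps(total_positive_points, fit_share):
    """Split total positive points into fit and behavioral caps."""
    if not 0 <= fit_share <= 1:
        raise ValueError("fit_share must be between 0 and 1")
    fit_cap = round(total_positive_points * fit_share)
    return {"fit": fit_cap, "behavioral": total_positive_points - fit_cap}


# Sales-led outbound: about 65 percent of positive points on fit.
outbound = rebalance_caps(115, 0.65)
# PLG: about 65 percent on behavioral, so 35 percent on fit.
plg = rebalance_caps(115, 0.35)
# High-velocity SMB: compressed 50-point scale, intent dominant.
smb = rebalance_caps(50, 0.4)
print(outbound, plg, smb)
```

The individual row values in each table would then be scaled in proportion to the new caps.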