Sales reps spend roughly 30% of their time actually selling, according to Salesforce's State of Sales report. The rest goes to admin work, internal meetings, and researching leads that may not be worth pursuing. A lead scoring model is supposed to fix this by telling reps which leads deserve their attention first.
In practice, most scoring models underperform. The reason is usually the same: the data feeding them is incomplete. A lead fills out a form with their email address and first name. The scoring model needs seniority level, company size, industry, and job function to produce a useful score. Without enrichment, those fields are empty, and the model defaults to treating every lead the same.
This article walks through how to build a lead scoring model that uses enrichment data as its foundation. Not a conceptual overview (there are plenty of those), but the actual implementation: choosing which fields to score, calibrating point values from your conversion data, setting thresholds, and keeping the model accurate over time.
If you haven't set up an enrichment pipeline yet, start with Building a Real-Time Enrichment Pipeline with a Person API. That covers the plumbing. This article covers what you do with the data once it's flowing.
Why Enrichment Is the Prerequisite
Lead scoring without enrichment is scoring in the dark. According to MarketingSherpa's B2B Marketing Benchmark Report, 61% of B2B marketers send all leads directly to sales without scoring or qualifying them. Only 27% of those leads turn out to be qualified. The result is a sales team buried in leads they can't close.
The fix seems obvious: score the leads first. But scoring requires data, and most inbound leads arrive sparse. A typical web form collects an email, maybe a name, maybe a company. That's not enough to determine whether someone is a VP at a 500-person SaaS company or an intern at a startup.
Enrichment fills those gaps at the moment the lead enters your system. An email address goes into the person enrichment API; a full professional profile comes back: job title, seniority level, company name, company size, industry, job function, location, years of experience. For account-level signals, a company enrichment API adds firmographic depth: employee count, industry, company size, growth trends. Now the scoring model has something to work with at both the contact and account level.
```python
import os

import requests


def enrich_lead(email):
    """Enrich a lead and return scoring-relevant fields."""
    response = requests.post(
        "https://api.datalegion.ai/person/enrich",
        headers={"API-Key": os.environ["DATA_LEGION_API_KEY"]},
        json={"email": email},
        timeout=10,  # don't let a slow enrichment call block form processing
    )
    if response.status_code == 404:
        return None
    response.raise_for_status()
    data = response.json()
    matches = data.get("matches", [])
    if not matches:
        return None
    person = matches[0]["person"]
    return {
        "seniority_level": person.get("seniority_level"),
        "job_function": person.get("job_function"),
        "company_size": person.get("company_size"),
        "company_industry": person.get("company_industry"),
        "years_of_experience": person.get("years_of_experience"),
    }
```

Five fields. That's the minimum set that drives most B2B lead scoring. Everything else (behavioral signals, engagement data, technographics) layers on top.
Defining Your Ideal Customer Profile First
Before assigning point values, you need to know what a good lead looks like for your business. This is your Ideal Customer Profile: the combination of attributes shared by your best customers.
Pull your closed-won deals from the last 12 months. For each deal, look at the contact who entered the pipeline first and note their:
- Seniority level (C-suite, VP, Director, Manager, Individual Contributor)
- Job function (Engineering, Sales, Marketing, Operations, etc.)
- Company size (by employee count band)
- Industry
- Years of experience
You're looking for concentrations. If 60% of your closed-won deals came from Director+ contacts at companies with 200-2,000 employees in SaaS or financial services, that's your ICP. The scoring model should reflect those patterns, not your assumptions about who should be buying.
```python
# Example: Analyzing closed-won deals to find ICP patterns
from collections import Counter


def analyze_closed_won(deals):
    """Count attribute frequencies across closed-won deals."""
    seniority_counts = Counter(
        d["seniority_level"] for d in deals if d.get("seniority_level")
    )
    size_counts = Counter(d["company_size"] for d in deals if d.get("company_size"))
    industry_counts = Counter(
        d["company_industry"] for d in deals if d.get("company_industry")
    )
    function_counts = Counter(d["job_function"] for d in deals if d.get("job_function"))
    return {
        "seniority": seniority_counts.most_common(5),
        "company_size": size_counts.most_common(5),
        "industry": industry_counts.most_common(10),
        "job_function": function_counts.most_common(5),
    }
```

Run this against your CRM export. The output tells you which attribute values to reward in your scoring model and, just as importantly, which ones to ignore.
Building the Scoring Model
A point-based scoring model assigns numeric values to lead attributes. Each attribute-value pair gets a score. The total determines whether the lead is worth pursuing.
Here's a scoring function built from the ICP analysis above:
```python
# Scoring rules calibrated from closed-won data
SCORING_RULES = {
    "seniority_level": {
        "c_level": 30,
        "vp": 25,
        "director": 20,
        "manager": 10,
        "senior": 5,
    },
    "company_size": {
        "201-500": 20,
        "501-1000": 20,
        "1001-5000": 15,
        "51-200": 10,
        "5001-10000": 10,
    },
    "company_industry": {
        "technology, information and internet": 15,
        "financial services": 15,
        "software development": 15,
        "hospitals and health care": 10,
    },
    "job_function": {
        "engineering": 15,
        "sales": 15,
        "marketing": 10,
        "operations": 10,
    },
}

# Bonus for years of experience: (minimum years, points awarded)
EXPERIENCE_BRACKETS = [
    (10, 15),  # 10+ years
    (5, 10),   # 5-9 years
    (2, 5),    # 2-4 years
]


def score_lead(enriched_data):
    """Score a lead based on enriched attributes."""
    if not enriched_data:
        return 0
    total = 0
    for field, values in SCORING_RULES.items():
        lead_value = enriched_data.get(field, "")
        if lead_value and lead_value in values:
            total += values[lead_value]
    yoe = enriched_data.get("years_of_experience")
    if yoe is not None:
        for min_years, points in EXPERIENCE_BRACKETS:
            if yoe >= min_years:
                total += points
                break
    return total
```

A few things to notice about this model.
The point values aren't arbitrary. They reflect the relative importance of each attribute in your closed-won deals. If 70% of your wins came from Director+ contacts, seniority gets the highest weight. If company size matters less (you close deals across all sizes), its maximum score is lower.
Not every value scores. An individual contributor at a 10-person company in an unrelated industry scores 0. That's by design. The model should produce a wide range of scores, with clear separation between good and bad leads.
Experience acts as a bonus, not a primary signal. Years of experience correlates with seniority but isn't redundant. A senior individual contributor with 15 years of experience is a different lead than one with 2 years, even if they share the same seniority label.
Calibrating Point Values from Conversion Data
The scoring rules above are a starting point. To calibrate them properly, you need to look at how each attribute correlates with actual conversions.
The process:
- Export all leads from the last 6-12 months with their enrichment data and outcome (converted vs. didn't convert).
- For each attribute value, calculate the conversion rate.
- Assign points proportional to how much each value lifts conversion above your baseline.
```python
def calculate_conversion_rates(leads):
    """Calculate conversion rate per attribute value."""
    rates = {}
    for field in ["seniority_level", "company_size", "company_industry", "job_function"]:
        value_stats = {}
        for lead in leads:
            value = lead.get(field)
            if not value:
                continue
            if value not in value_stats:
                value_stats[value] = {"total": 0, "converted": 0}
            value_stats[value]["total"] += 1
            if lead.get("converted"):
                value_stats[value]["converted"] += 1
        rates[field] = {
            value: stats["converted"] / stats["total"]
            for value, stats in value_stats.items()
            if stats["total"] >= 10  # minimum sample size
        }
    return rates
```

If your baseline conversion rate is 5% and VPs convert at 18%, that 3.6x lift justifies the highest seniority score. If managers convert at 7% (1.4x lift), they get a proportionally smaller score. If individual contributors convert at 2% (below baseline), they get zero or negative points.
This data-driven approach catches things your intuition might miss. Maybe your ICP says "Director+" but the data shows that senior managers at mid-market companies actually convert better than VPs at enterprise companies. The model should reflect what the data says, not what the sales team believes.
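The lift-to-points mapping can be sketched as a small helper. This is an illustrative formula, not a standard: the 10-points-per-1x-of-lift scaling and the cap are assumptions to tune against your own score range.

```python
def lift_to_points(conversion_rate, baseline_rate, max_points=30):
    """Convert conversion lift over baseline into a point value.

    Values converting at or below baseline earn zero points. The
    scaling (10 points per 1x of lift above baseline) is an assumption.
    """
    if baseline_rate <= 0:
        return 0
    lift = conversion_rate / baseline_rate
    if lift <= 1.0:
        return 0
    return min(max_points, round((lift - 1.0) * 10))


# VPs at 18% vs. a 5% baseline (3.6x lift) earn far more than
# managers at 7% (1.4x lift); ICs at 2% (below baseline) earn nothing.
print(lift_to_points(0.18, 0.05))  # 26
print(lift_to_points(0.07, 0.05))  # 4
print(lift_to_points(0.02, 0.05))  # 0
```

Clamping at `max_points` keeps a single outlier attribute from dominating the total score.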
Adding Behavioral Scoring
Enrichment data gives you a demographic score: who the person is. Behavioral scoring adds a second dimension: what the person is doing.
Common behavioral signals and their typical weights:
```python
BEHAVIORAL_SCORES = {
    "visited_pricing_page": 20,
    "downloaded_whitepaper": 10,
    "attended_webinar": 15,
    "requested_demo": 30,
    "visited_docs": 10,
    "opened_email_3x": 5,
    "visited_site_3_plus_times": 10,
}


def score_lead_combined(enriched_data, behavioral_events):
    """Combine demographic and behavioral scoring."""
    demographic_score = score_lead(enriched_data)
    behavioral_score = 0
    for event in behavioral_events:
        event_type = event.get("type")
        if event_type in BEHAVIORAL_SCORES:
            behavioral_score += BEHAVIORAL_SCORES[event_type]
    return {
        "demographic": demographic_score,
        "behavioral": behavioral_score,
        "total": demographic_score + behavioral_score,
    }
```

Keeping the two scores separate (even though you sum them for a total) matters for routing. A lead with high demographic score but low behavioral score is a good fit who hasn't engaged yet. That might go into a nurture sequence. A lead with low demographic score but high behavioral score (an IC who visited your pricing page three times and requested a demo) might be a champion who can introduce you to the decision maker.
| Demographic | Behavioral | Routing |
|---|---|---|
| High | High | Fast-track to sales |
| High | Low | Marketing nurture |
| Low | High | Qualify further (could be a champion) |
| Low | Low | Automated sequence or deprioritize |
This two-axis model is more useful than a single combined score because it tells sales not just whether to pursue a lead, but how.
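A routing function over the two axes might look like the sketch below. The threshold defaults are placeholders, not recommendations; set them from your own score-band analysis.

```python
def route_lead(demographic, behavioral, demo_threshold=40, behav_threshold=25):
    """Map demographic and behavioral scores onto the routing matrix above.

    The default thresholds are illustrative placeholders.
    """
    good_fit = demographic >= demo_threshold
    engaged = behavioral >= behav_threshold
    if good_fit and engaged:
        return "fast_track_to_sales"
    if good_fit:
        return "marketing_nurture"
    if engaged:
        return "qualify_further"  # possible champion; confirm fit manually
    return "automated_sequence"


print(route_lead(75, 50))  # fast_track_to_sales
print(route_lead(10, 60))  # qualify_further
```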
Setting MQL and SQL Thresholds
A score is meaningless without a threshold. The threshold defines when a lead becomes a Marketing Qualified Lead (MQL) and when it becomes a Sales Qualified Lead (SQL) ready for handoff.
MQL-to-SQL conversion rates average 12-21% across industries, with top performers reaching 25-35%. If your conversion rate is significantly below that range, your threshold is likely too low (you're passing unqualified leads to sales) or too high (you're holding back leads that are ready).
To set the initial threshold:
- Score all your historical leads retroactively using the model.
- Split them into score bands (0-20, 21-40, 41-60, 61-80, 81-100).
- Calculate the conversion rate for each band.
- Set the MQL threshold where conversion rate exceeds your baseline by 2x or more.
- Set the SQL threshold where conversion rate exceeds your baseline by 4x or more.
```python
def analyze_score_bands(scored_leads, band_size=20):
    """Analyze conversion rates by score band."""
    bands = {}
    for lead in scored_leads:
        score = lead["score"]
        band = (score // band_size) * band_size
        band_label = f"{band}-{band + band_size - 1}"
        if band_label not in bands:
            bands[band_label] = {"total": 0, "converted": 0}
        bands[band_label]["total"] += 1
        if lead.get("converted"):
            bands[band_label]["converted"] += 1
    for band_label, stats in sorted(bands.items()):
        rate = stats["converted"] / stats["total"] if stats["total"] > 0 else 0
        print(f"  {band_label}: {rate:.1%} ({stats['converted']}/{stats['total']})")
```

A typical output might look like:
```
  0-19:  2.1% (8/380)
  20-39: 4.8% (15/312)
  40-59: 11.3% (28/248)
  60-79: 22.7% (34/150)
  80-99: 38.5% (25/65)
```

In this example, the baseline is around 5%. The 40-59 band shows a 2x+ lift, so that's a reasonable MQL threshold. The 60-79 band shows 4x+ lift, making 60 a good SQL threshold. Leads scoring 80+ are your best opportunities and should be routed to senior reps immediately.
These numbers will be different for every business. The method matters more than the specific values.
Score Decay
A lead scored 85 three months ago is not the same quality as a lead scored 85 today. Contact data decays at roughly 2% per month. People change jobs, companies restructure, roles shift. A VP who was a perfect fit in January may have moved to a different company by April.
Behavioral signals decay even faster. A pricing page visit from last week signals intent. The same visit from six months ago doesn't.
Implement decay by reducing scores over time:
```python
from datetime import datetime


def apply_score_decay(lead):
    """Reduce scores based on data age."""
    days_since_enrichment = (datetime.now() - lead["last_enriched"]).days
    days_since_activity = (datetime.now() - lead["last_activity"]).days

    # Demographic score: decay 10% per month after 90 days
    demographic_decay = 1.0
    if days_since_enrichment > 90:
        months_stale = (days_since_enrichment - 90) / 30
        demographic_decay = max(0.3, 1.0 - (months_stale * 0.1))

    # Behavioral score: decay 25% per month after 30 days
    behavioral_decay = 1.0
    if days_since_activity > 30:
        months_inactive = (days_since_activity - 30) / 30
        behavioral_decay = max(0.1, 1.0 - (months_inactive * 0.25))

    return {
        "demographic": int(lead["demographic_score"] * demographic_decay),
        "behavioral": int(lead["behavioral_score"] * behavioral_decay),
        "total": int(
            lead["demographic_score"] * demographic_decay
            + lead["behavioral_score"] * behavioral_decay
        ),
    }
```

Two things to note about this approach. Demographic scores decay slower than behavioral scores because professional attributes change less frequently than engagement signals. And neither score decays to zero. A lead who was a strong demographic fit six months ago is still worth re-engaging if they show new activity.
The fix for demographic decay is re-enrichment. Run stale leads (90+ days since last enrichment) through the enrichment API again to refresh their profile data, then rescore. This is the same hygiene pattern described in the GTM engineer's guide, applied specifically to scoring.
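A re-enrichment sweep along those lines could look like this sketch. It takes the enrichment and scoring functions as parameters (standing in for `enrich_lead` and `score_lead` from earlier) so the freshness logic stays testable on its own.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=90)


def refresh_stale_leads(leads, enrich_fn, score_fn, now=None):
    """Re-enrich leads whose profile data is 90+ days old, then rescore.

    enrich_fn and score_fn stand in for enrich_lead/score_lead above.
    Returns the leads that were actually refreshed.
    """
    now = now or datetime.now()
    refreshed = []
    for lead in leads:
        if now - lead["last_enriched"] < STALE_AFTER:
            continue  # still fresh; skip
        enriched = enrich_fn(lead["email"])
        if enriched is None:
            continue  # no match returned; keep the old profile for now
        lead.update(enriched)
        lead["demographic_score"] = score_fn(enriched)
        lead["last_enriched"] = now
        refreshed.append(lead)
    return refreshed
```

Running this on a nightly or weekly schedule keeps demographic scores anchored to current job titles rather than whatever was true when the lead first converted.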
Measuring Whether the Model Works
A scoring model that nobody validates will drift. Conversion patterns change. Your product evolves. New competitors shift who responds to your marketing. You need to check whether the scores still predict outcomes.
Track these four metrics monthly:
Conversion rate by score band is the core health check. If your top score band stops converting at a meaningfully higher rate than your middle bands, the model has lost its predictive power.
MQL-to-SQL acceptance rate tells you whether sales agrees with marketing's definition of "qualified." If acceptance is below 50%, your MQL threshold is too low or your scoring criteria don't match what sales considers qualified.
Score distribution reveals whether the model differentiates at all. If 80% of leads score between 40-60, it isn't. A good model produces a spread, with clear clusters at the low end (bad fit), middle (worth nurturing), and high end (ready for sales).
Time to close by score band validates that the model predicts purchase readiness, not just fit. High-scoring leads should close faster. If they don't, the model is measuring something else.
```python
def monthly_model_health(leads_this_month):
    """Run monthly health checks on the scoring model."""
    by_band = {}
    for lead in leads_this_month:
        band = (lead["score"] // 20) * 20
        if band not in by_band:
            by_band[band] = {"total": 0, "converted": 0, "days_to_close": []}
        by_band[band]["total"] += 1
        if lead.get("converted"):
            by_band[band]["converted"] += 1
            if lead.get("days_to_close"):
                by_band[band]["days_to_close"].append(lead["days_to_close"])
    for band in sorted(by_band):
        stats = by_band[band]
        rate = stats["converted"] / stats["total"] if stats["total"] > 0 else 0
        avg_days = (
            sum(stats["days_to_close"]) / len(stats["days_to_close"])
            if stats["days_to_close"]
            else None
        )
        line = f"  {band}-{band + 19}: {rate:.1%} conversion"
        if avg_days is not None:
            line += f", avg close: {avg_days:.0f} days"
        print(line)
```

When the numbers drift, recalibrate. Rerun the conversion rate analysis from the "Calibrating Point Values" section using the most recent 6 months of data. Adjust the point values. Move the thresholds. This isn't a failure of the model. It's how scoring models are supposed to work: continuous calibration against real outcomes.
A quarterly recalibration cadence works for most teams. If your market or product is changing fast, monthly.
The Complete Flow
Here's how the pieces connect:
- A lead submits a form with their email.
- The enrichment pipeline fills in seniority, company size, industry, job function, and years of experience. (See the pipeline tutorial for setup.)
- The scoring model calculates a demographic score from the enriched fields.
- Behavioral tracking adds engagement signals over time.
- The combined score determines routing: fast-track to sales, marketing nurture, or automated sequence.
- Scores decay over time. Stale leads get re-enriched and rescored.
- Monthly health checks validate that the model still predicts conversions.
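The inbound half of that flow (enrich, score, classify) composes into a short orchestration sketch. The callables stand in for the functions defined throughout this article, and the default thresholds echo the illustrative band analysis above; neither is a prescription.

```python
def process_new_lead(email, enrich_fn, score_fn,
                     mql_threshold=40, sql_threshold=60):
    """Enrich an inbound lead, compute its demographic score, classify it.

    enrich_fn/score_fn stand in for enrich_lead/score_lead above.
    Behavioral scoring and decay apply later, as events accumulate.
    """
    enriched = enrich_fn(email)
    score = score_fn(enriched) if enriched else 0  # unmatched leads score 0
    if score >= sql_threshold:
        stage = "SQL"
    elif score >= mql_threshold:
        stage = "MQL"
    else:
        stage = "unqualified"
    return {"email": email, "score": score, "stage": stage, "enriched": enriched}
```

A lead the enrichment API can't match falls through as unqualified rather than erroring, which keeps the form-submission path resilient.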
The model doesn't need to be perfect on day one. Start with the five core enrichment fields, set rough thresholds based on your historical data, and refine from there. The advantage of a point-based model is that it's transparent. When a rep asks why a lead scored 75, you can show them exactly which attributes contributed. When the model drifts, you can see which field lost its predictive power.
Scoring logic is the easy part. Complete, accurate data to score against is what most teams are missing. Get both person and company enrichment right and the scoring follows.