Sales reps spend roughly 30% of their time actually selling, according to Salesforce's State of Sales report. The rest goes to admin work, internal meetings, and researching leads that may not be worth pursuing. A lead scoring model is supposed to fix this by telling reps which leads deserve their attention first.
In practice, most scoring models underperform. The reason is usually the same: the data feeding them is incomplete. A lead fills out a form with their email address and first name. The scoring model needs seniority level, company size, industry, and job function to produce a useful score. Without enrichment, those fields are empty, and the model defaults to treating every lead the same.
This article walks through how to build a lead scoring model that uses enrichment data as its foundation. Not a conceptual overview (there are plenty of those), but the actual implementation: choosing which fields to score, calibrating point values from your conversion data, setting thresholds, and keeping the model accurate over time.
If you haven't set up an enrichment pipeline yet, start with Building a Real-Time Enrichment Pipeline with a Person API. That covers the plumbing. This article covers what you do with the data once it's flowing.
Why Enrichment Is the Prerequisite
Lead scoring without enrichment is scoring in the dark. According to MarketingSherpa's B2B Marketing Benchmark Report, 61% of B2B marketers send all leads directly to sales without scoring or qualifying them. Only 27% of those leads turn out to be qualified. The result is a sales team buried in leads they can't close.
The fix seems obvious: score the leads first. But scoring requires data, and most inbound leads arrive sparse. A typical web form collects an email, maybe a name, maybe a company. That's not enough to determine whether someone is a VP at a 500-person SaaS company or an intern at a startup.
Enrichment fills those gaps at the moment the lead enters your system. An email address goes into the person enrichment API; a full professional profile comes back: job title, seniority level, company name, company size, industry, job function, location, years of experience. For account-level signals, a company enrichment API adds firmographic depth: employee count, industry, company size, growth trends. Now the scoring model has something to work with at both the contact and account level.
```python
import os

import requests


def enrich_lead(email):
    """Enrich a lead and return scoring-relevant fields."""
    response = requests.post(
        "https://api.datalegion.ai/person/enrich",
        headers={"API-Key": os.environ["DATA_LEGION_API_KEY"]},
        json={"email": email},
        timeout=10,  # don't let a slow enrichment call block form processing
    )
    if response.status_code == 404:
        return None
    response.raise_for_status()
    data = response.json()
    matches = data.get("matches", [])
    if not matches:
        return None
    person = matches[0]["person"]
    return {
        "seniority_level": person.get("seniority_level"),
        "job_function": person.get("job_function"),
        "company_size": person.get("company_size"),
        "company_industry": person.get("company_industry"),
        "years_of_experience": person.get("years_of_experience"),
    }
```

Five fields. That's the minimum set that drives most B2B lead scoring. Everything else (behavioral signals, engagement data, technographics) layers on top.
Defining Your Ideal Customer Profile First
Before assigning point values, you need to know what a good lead looks like for your business. This is your Ideal Customer Profile: the combination of attributes shared by your best customers.
Pull your closed-won deals from the last 12 months. For each deal, look at the contact who entered the pipeline first and note their:
- Seniority level (C-suite, VP, Director, Manager, Individual Contributor)
- Job function (Engineering, Sales, Marketing, Operations, etc.)
- Company size (by employee count band)
- Industry
- Years of experience
You're looking for concentrations. If 60% of your closed-won deals came from Director+ contacts at companies with 200-2,000 employees in SaaS or financial services, that's your ICP. The scoring model should reflect those patterns, not your assumptions about who should be buying.
```python
# Example: Analyzing closed-won deals to find ICP patterns
from collections import Counter


def analyze_closed_won(deals):
    """Count attribute frequencies across closed-won deals."""
    seniority_counts = Counter(
        d["seniority_level"] for d in deals if d.get("seniority_level")
    )
    size_counts = Counter(d["company_size"] for d in deals if d.get("company_size"))
    industry_counts = Counter(
        d["company_industry"] for d in deals if d.get("company_industry")
    )
    function_counts = Counter(d["job_function"] for d in deals if d.get("job_function"))
    return {
        "seniority": seniority_counts.most_common(5),
        "company_size": size_counts.most_common(5),
        "industry": industry_counts.most_common(10),
        "job_function": function_counts.most_common(5),
    }
```

Run this against your CRM export. The output tells you which attribute values to reward in your scoring model and, just as importantly, which ones to ignore.
Building the Scoring Model
A point-based scoring model assigns numeric values to lead attributes. Each attribute-value pair gets a score. The total determines whether the lead is worth pursuing.
Here's a scoring function built from the ICP analysis above:
```python
# Scoring rules calibrated from closed-won data
SCORING_RULES = {
    "seniority_level": {
        "c_level": 30,
        "vp": 25,
        "director": 20,
        "manager": 10,
        "senior": 5,
    },
    "company_size": {
        "201-500": 20,
        "501-1000": 20,
        "1001-5000": 15,
        "51-200": 10,
        "5001-10000": 10,
    },
    "company_industry": {
        "technology, information and internet": 15,
        "financial services": 15,
        "software development": 15,
        "hospitals and health care": 10,
    },
    "job_function": {
        "engineering": 15,
        "sales": 15,
        "marketing": 10,
        "operations": 10,
    },
}

# Bonus for years of experience: (minimum years, points awarded)
EXPERIENCE_BRACKETS = [
    (10, 15),  # 10+ years
    (5, 10),   # 5-9 years
    (2, 5),    # 2-4 years
]


def score_lead(enriched_data):
    """Score a lead based on enriched attributes."""
    if not enriched_data:
        return 0
    total = 0
    for field, values in SCORING_RULES.items():
        lead_value = enriched_data.get(field, "")
        if lead_value and lead_value in values:
            total += values[lead_value]
    yoe = enriched_data.get("years_of_experience")
    if yoe is not None:
        for min_years, points in EXPERIENCE_BRACKETS:
            if yoe >= min_years:
                total += points
                break
    return total
```

A few things to notice about this model.
The point values aren't arbitrary. They reflect the relative importance of each attribute in your closed-won deals. If 70% of your wins came from Director+ contacts, seniority gets the highest weight. If company size matters less (you close deals across all sizes), its maximum score is lower.
Not every value scores. An individual contributor at a 10-person company in an unrelated industry scores 0. That's by design. The model should produce a wide range of scores, with clear separation between good and bad leads.
Experience acts as a bonus, not a primary signal. Years of experience correlates with seniority but isn't redundant. A senior individual contributor with 15 years of experience is a different lead than one with 2 years, even if they share the same seniority label.
Calibrating Point Values from Conversion Data
The scoring rules above are a starting point. To calibrate them properly, you need to look at how each attribute correlates with actual conversions.
The process:
- Export all leads from the last 6-12 months with their enrichment data and outcome (converted vs. didn't convert).
- For each attribute value, calculate the conversion rate.
- Assign points proportional to how much each value lifts conversion above your baseline.
```python
def calculate_conversion_rates(leads):
    """Calculate conversion rate per attribute value."""
    rates = {}
    for field in ["seniority_level", "company_size", "company_industry", "job_function"]:
        value_stats = {}
        for lead in leads:
            value = lead.get(field)
            if not value:
                continue
            if value not in value_stats:
                value_stats[value] = {"total": 0, "converted": 0}
            value_stats[value]["total"] += 1
            if lead.get("converted"):
                value_stats[value]["converted"] += 1
        rates[field] = {
            value: stats["converted"] / stats["total"]
            for value, stats in value_stats.items()
            if stats["total"] >= 10  # minimum sample size
        }
    return rates
```

If your baseline conversion rate is 5% and VPs convert at 18%, that 3.6x lift justifies the highest seniority score. If managers convert at 7% (1.4x lift), they get a proportionally smaller score. If individual contributors convert at 2% (below baseline), they get zero or negative points.
This data-driven approach catches things your intuition might miss. Maybe your ICP says "Director+" but the data shows that senior managers at mid-market companies actually convert better than VPs at enterprise companies. The model should reflect what the data says, not what the sales team believes.
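The lift-to-points mapping can be sketched as a small helper. This is an illustrative formula, not a standard: the 10-points-per-1x-of-lift scaling and the cap are assumptions to tune against your own score range.

```python
def lift_to_points(conversion_rate, baseline_rate, max_points=30):
    """Convert conversion lift over baseline into a point value.

    Values converting at or below baseline earn zero points. The
    scaling (10 points per 1x of lift above baseline) is an assumption.
    """
    if baseline_rate <= 0:
        return 0
    lift = conversion_rate / baseline_rate
    if lift <= 1.0:
        return 0
    return min(max_points, round((lift - 1.0) * 10))


# VPs at 18% vs. a 5% baseline (3.6x lift) earn far more than
# managers at 7% (1.4x lift); ICs at 2% (below baseline) earn nothing.
print(lift_to_points(0.18, 0.05))  # 26
print(lift_to_points(0.07, 0.05))  # 4
print(lift_to_points(0.02, 0.05))  # 0
```

Clamping at `max_points` keeps a single outlier attribute from dominating the total score.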
Adding Behavioral Scoring
Enrichment data gives you a demographic score: who the person is. Behavioral scoring adds a second dimension: what the person is doing.
Common behavioral signals and their typical weights:
```python
BEHAVIORAL_SCORES = {
    "visited_pricing_page": 20,
    "downloaded_whitepaper": 10,
    "attended_webinar": 15,
    "requested_demo": 30,
    "visited_docs": 10,
    "opened_email_3x": 5,
    "visited_site_3_plus_times": 10,
}


def score_lead_combined(enriched_data, behavioral_events):
    """Combine demographic and behavioral scoring."""
    demographic_score = score_lead(enriched_data)
    behavioral_score = 0
    for event in behavioral_events:
        event_type = event.get("type")
        if event_type in BEHAVIORAL_SCORES:
            behavioral_score += BEHAVIORAL_SCORES[event_type]
    return {
        "demographic": demographic_score,
        "behavioral": behavioral_score,
        "total": demographic_score + behavioral_score,
    }
```

Keeping the two scores separate (even though you sum them for a total) matters for routing. A lead with high demographic score but low behavioral score is a good fit who hasn't engaged yet. That might go into a nurture sequence. A lead with low demographic score but high behavioral score (an IC who visited your pricing page three times and requested a demo) might be a champion who can introduce you to the decision maker.
| Demographic | Behavioral | Routing |
|---|---|---|
| High | High | Fast-track to sales |
| High | Low | Marketing nurture |
| Low | High | Qualify further (could be a champion) |
| Low | Low | Automated sequence or deprioritize |
This two-axis model is more useful than a single combined score because it tells sales not just whether to pursue a lead, but how.
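A routing function over the two axes might look like the sketch below. The threshold defaults are placeholders, not recommendations; set them from your own score-band analysis.

```python
def route_lead(demographic, behavioral, demo_threshold=40, behav_threshold=25):
    """Map demographic and behavioral scores onto the routing matrix above.

    The default thresholds are illustrative placeholders.
    """
    good_fit = demographic >= demo_threshold
    engaged = behavioral >= behav_threshold
    if good_fit and engaged:
        return "fast_track_to_sales"
    if good_fit:
        return "marketing_nurture"
    if engaged:
        return "qualify_further"  # possible champion; confirm fit manually
    return "automated_sequence"


print(route_lead(75, 50))  # fast_track_to_sales
print(route_lead(10, 60))  # qualify_further
```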
Setting MQL and SQL Thresholds
A score is meaningless without a threshold. The threshold defines when a lead becomes a Marketing Qualified Lead (MQL) and when it becomes a Sales Qualified Lead (SQL) ready for handoff.
MQL-to-SQL conversion rates average 12-21% across industries, with top performers reaching 25-35%. If your conversion rate is significantly below that range, your threshold is likely too low (you're passing unqualified leads to sales) or too high (you're holding back leads that are ready).
To set the initial threshold:
- Score all your historical leads retroactively using the model.
- Split them into score bands (0-20, 21-40, 41-60, 61-80, 81-100).
- Calculate the conversion rate for each band.
- Set the MQL threshold where conversion rate exceeds your baseline by 2x or more.
- Set the SQL threshold where conversion rate exceeds your baseline by 4x or more.
```python
def analyze_score_bands(scored_leads, band_size=20):
    """Analyze conversion rates by score band."""
    bands = {}
    for lead in scored_leads:
        score = lead["score"]
        band = (score // band_size) * band_size
        band_label = f"{band}-{band + band_size - 1}"
        if band_label not in bands:
            bands[band_label] = {"total": 0, "converted": 0}
        bands[band_label]["total"] += 1
        if lead.get("converted"):
            bands[band_label]["converted"] += 1
    for band_label, stats in sorted(bands.items()):
        rate = stats["converted"] / stats["total"] if stats["total"] > 0 else 0
        print(f"  {band_label}: {rate:.1%} ({stats['converted']}/{stats['total']})")
```

A typical output might look like:
```
  0-19:  2.1% (8/380)
  20-39: 4.8% (15/312)
  40-59: 11.3% (28/248)
  60-79: 22.7% (34/150)
  80-99: 38.5% (25/65)
```

In this example, the baseline is around 5%. The 40-59 band shows a 2x+ lift, so that's a reasonable MQL threshold. The 60-79 band shows 4x+ lift, making 60 a good SQL threshold. Leads scoring 80+ are your best opportunities and should be routed to senior reps immediately.
These numbers will be different for every business. The method matters more than the specific values.
Score Decay
A lead scored 85 three months ago is not the same quality as a lead scored 85 today. Contact data decays at roughly 2% per month. People change jobs, companies restructure, roles shift. A VP who was a perfect fit in January may have moved to a different company by April.
Behavioral signals decay even faster. A pricing page visit from last week signals intent. The same visit from six months ago doesn't.
Implement decay by reducing scores over time:
```python
from datetime import datetime


def apply_score_decay(lead):
    """Reduce scores based on data age."""
    days_since_enrichment = (datetime.now() - lead["last_enriched"]).days
    days_since_activity = (datetime.now() - lead["last_activity"]).days

    # Demographic score: decay 10% per month after 90 days
    demographic_decay = 1.0
    if days_since_enrichment > 90:
        months_stale = (days_since_enrichment - 90) / 30
        demographic_decay = max(0.3, 1.0 - (months_stale * 0.1))

    # Behavioral score: decay 25% per month after 30 days
    behavioral_decay = 1.0
    if days_since_activity > 30:
        months_inactive = (days_since_activity - 30) / 30
        behavioral_decay = max(0.1, 1.0 - (months_inactive * 0.25))

    return {
        "demographic": int(lead["demographic_score"] * demographic_decay),
        "behavioral": int(lead["behavioral_score"] * behavioral_decay),
        "total": int(
            lead["demographic_score"] * demographic_decay
            + lead["behavioral_score"] * behavioral_decay
        ),
    }
```

Two things to note about this approach. Demographic scores decay slower than behavioral scores because professional attributes change less frequently than engagement signals. And neither score decays to zero. A lead who was a strong demographic fit six months ago is still worth re-engaging if they show new activity.
The fix for demographic decay is re-enrichment. Run stale leads (90+ days since last enrichment) through the enrichment API again to refresh their profile data, then rescore. This is the same hygiene pattern described in the GTM engineer's guide, applied specifically to scoring.
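A re-enrichment sweep along those lines could look like this sketch. It takes the enrichment and scoring functions as parameters (standing in for `enrich_lead` and `score_lead` from earlier) so the freshness logic stays testable on its own.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=90)


def refresh_stale_leads(leads, enrich_fn, score_fn, now=None):
    """Re-enrich leads whose profile data is 90+ days old, then rescore.

    enrich_fn and score_fn stand in for enrich_lead/score_lead above.
    Returns the leads that were actually refreshed.
    """
    now = now or datetime.now()
    refreshed = []
    for lead in leads:
        if now - lead["last_enriched"] < STALE_AFTER:
            continue  # still fresh; skip
        enriched = enrich_fn(lead["email"])
        if enriched is None:
            continue  # no match returned; keep the old profile for now
        lead.update(enriched)
        lead["demographic_score"] = score_fn(enriched)
        lead["last_enriched"] = now
        refreshed.append(lead)
    return refreshed
```

Running this on a nightly or weekly schedule keeps demographic scores anchored to current job titles rather than whatever was true when the lead first converted.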
Measuring Whether the Model Works
A scoring model that nobody validates will drift. Conversion patterns change. Your product evolves. New competitors shift who responds to your marketing. You need to check whether the scores still predict outcomes.
Track these four metrics monthly:
Conversion rate by score band is the core health check. If your top score band stops converting at a meaningfully higher rate than your middle bands, the model has lost its predictive power.
MQL-to-SQL acceptance rate tells you whether sales agrees with marketing's definition of "qualified." If acceptance is below 50%, your MQL threshold is too low or your scoring criteria don't match what sales considers qualified.
Score distribution reveals whether the model differentiates at all. If 80% of leads score between 40-60, it isn't. A good model produces a spread, with clear clusters at the low end (bad fit), middle (worth nurturing), and high end (ready for sales).
Time to close by score band validates that the model predicts purchase readiness, not just fit. High-scoring leads should close faster. If they don't, the model is measuring something else.
```python
def monthly_model_health(leads_this_month):
    """Run monthly health checks on the scoring model."""
    by_band = {}
    for lead in leads_this_month:
        band = (lead["score"] // 20) * 20
        if band not in by_band:
            by_band[band] = {"total": 0, "converted": 0, "days_to_close": []}
        by_band[band]["total"] += 1
        if lead.get("converted"):
            by_band[band]["converted"] += 1
            if lead.get("days_to_close"):
                by_band[band]["days_to_close"].append(lead["days_to_close"])
    for band in sorted(by_band):
        stats = by_band[band]
        rate = stats["converted"] / stats["total"] if stats["total"] > 0 else 0
        avg_days = (
            sum(stats["days_to_close"]) / len(stats["days_to_close"])
            if stats["days_to_close"]
            else None
        )
        line = f"  {band}-{band + 19}: {rate:.1%} conversion"
        if avg_days is not None:
            line += f", avg close: {avg_days:.0f} days"
        print(line)
```

When the numbers drift, recalibrate. Rerun the conversion rate analysis from the "Calibrating Point Values" section using the most recent 6 months of data. Adjust the point values. Move the thresholds. This isn't a failure of the model. It's how scoring models are supposed to work: continuous calibration against real outcomes.
A quarterly recalibration cadence works for most teams. If your market or product is changing fast, monthly.
The Complete Flow
Here's how the pieces connect:
- A lead submits a form with their email.
- The enrichment pipeline fills in seniority, company size, industry, job function, and years of experience. (See the pipeline tutorial for setup.)
- The scoring model calculates a demographic score from the enriched fields.
- Behavioral tracking adds engagement signals over time.
- The combined score determines routing: fast-track to sales, marketing nurture, or automated sequence.
- Scores decay over time. Stale leads get re-enriched and rescored.
- Monthly health checks validate that the model still predicts conversions.
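The inbound half of that flow (enrich, score, classify) composes into a short orchestration sketch. The callables stand in for the functions defined throughout this article, and the default thresholds echo the illustrative band analysis above; neither is a prescription.

```python
def process_new_lead(email, enrich_fn, score_fn,
                     mql_threshold=40, sql_threshold=60):
    """Enrich an inbound lead, compute its demographic score, classify it.

    enrich_fn/score_fn stand in for enrich_lead/score_lead above.
    Behavioral scoring and decay apply later, as events accumulate.
    """
    enriched = enrich_fn(email)
    score = score_fn(enriched) if enriched else 0  # unmatched leads score 0
    if score >= sql_threshold:
        stage = "SQL"
    elif score >= mql_threshold:
        stage = "MQL"
    else:
        stage = "unqualified"
    return {"email": email, "score": score, "stage": stage, "enriched": enriched}
```

A lead the enrichment API can't match falls through as unqualified rather than erroring, which keeps the form-submission path resilient.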
The model doesn't need to be perfect on day one. Start with the five core enrichment fields, set rough thresholds based on your historical data, and refine from there. The advantage of a point-based model is that it's transparent. When a rep asks why a lead scored 75, you can show them exactly which attributes contributed. When the model drifts, you can see which field lost its predictive power.
Scoring logic is the easy part. Complete, accurate data to score against is what most teams are missing. Get both person and company enrichment right and the scoring follows.