Statistics on field coverage, confidence distribution, and dataset composition.

Dataset Snapshot

  • Total contacts: 188,800,817

Confidence Distribution (linkage confidence)

| Attribute | high | moderate | low | Total |
|---|---:|---:|---:|---:|
| Phones | 59,322,394 | 14,326,902 | 15,716,179 | 89,365,475 |
| Emails | 149,646,456 | 55,094,361 | 61,102,491 | 265,843,308 |
| Locations | 112,893,380 | 105,554,268 | 141,099,438 | 359,547,086 |
| Socials | 134,669,131 | 17,498,466 | 15,757,485 | 167,925,082 |
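
The counts above can be turned into per-bucket shares, which makes the attributes easier to compare. The following is a minimal sketch: the numbers are copied from the table, while the names `CONFIDENCE_COUNTS` and `confidence_shares` are illustrative and not part of the dataset or any official tooling.

```python
# Counts copied from the Confidence Distribution table.
# The dictionary layout and function name are illustrative assumptions.
CONFIDENCE_COUNTS = {
    "phones": {"high": 59_322_394, "moderate": 14_326_902, "low": 15_716_179},
    "emails": {"high": 149_646_456, "moderate": 55_094_361, "low": 61_102_491},
    "locations": {"high": 112_893_380, "moderate": 105_554_268, "low": 141_099_438},
    "socials": {"high": 134_669_131, "moderate": 17_498_466, "low": 15_757_485},
}


def confidence_shares(counts):
    """Return each bucket's fraction of the attribute's total."""
    total = sum(counts.values())
    return {bucket: n / total for bucket, n in counts.items()}


if __name__ == "__main__":
    for attribute, counts in CONFIDENCE_COUNTS.items():
        shares = confidence_shares(counts)
        formatted = ", ".join(f"{b}: {s:.1%}" for b, s in shares.items())
        print(f"{attribute:<10} {formatted}")
```

Read this way, locations have the largest low-confidence share of the four attributes, while socials are mostly high confidence.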

Coverage Summary

| Field | Type | Description | Nullable | Fill Rate | Count |
|---|---|---|---|---:|---:|
| legion_id | string | Stable contact identifier | No | 100% | 188,800,817 |
| full_name | string | Best-available full name | No | 100% | 188,800,817 |
| first_name | string | First name | No | 100% | 188,800,817 |
| middle_name | string | Middle name (non-initial) | Yes | 5% | 8,937,558 |
| middle_initial | string | Middle initial (provided or derived) | Yes | 15% | 28,964,163 |
| last_name | string | Last name | No | 100% | 188,800,817 |
| last_initial | string | Last name initial | No | 100% | 188,800,817 |
| suffix | string | Name suffix | Yes | 1% | 2,261,273 |
| prefix | string | Name prefix | Yes | 0% | 113,168 |
| sex | enum | Sex value | Yes | 90% | 169,927,931 |
| birth_date | string | Birth date (precision preserved) | Yes | 33% | 62,928,706 |
| birth_year | integer | Birth year | Yes | 33% | 62,928,706 |
| birth_month | integer | Birth month | Yes | 33% | 62,865,048 |
| birth_day | integer | Birth day | Yes | 1% | 1,911,777 |
| age | integer | Age in years | Yes | 33% | 62,928,706 |
| job_title | string | Current job title | Yes | 66% | 124,694,156 |
| company_legion_id | string | Current company Data Legion ID | Yes | 44% | 83,604,753 |
| company_name | string | Current company name | Yes | 64% | 121,073,363 |
| company_domain | string | Current company website domain | Yes | 43% | 80,599,071 |
| company_linkedin_url | string | Current company LinkedIn URL | Yes | 45% | 84,982,711 |
| company_linkedin_id | string | Current company LinkedIn numeric ID | Yes | 44% | 83,604,753 |
| company_industry | enum | Current company industry (see enums/industry.csv) | Yes | 43% | 80,950,832 |
| company_size | enum | Current company size (see enums/company_size.csv) | Yes | 45% | 84,450,084 |
| seniority_level | enum | Seniority level (c_level/owner/partner/vp/director/manager/senior/junior/training/intern) | Yes | 27% | 51,679,137 |
| job_function | enum | Job function (engineering/sales/marketing/healthcare/education/etc.) | Yes | 66% | 124,605,695 |
| expense_category | enum | P&L expense category (see enums/expense_category.csv) | Yes | 66% | 124,605,695 |
| is_decision_maker | boolean | True if seniority is c_level/owner/partner/vp/director | Yes | 100% | 188,800,817 |
| is_platform_worker | boolean | True if current job is platform/gig usage (e.g. Airbnb host, Uber driver), not employment | Yes | 100% | 188,800,817 |
| years_of_experience | integer | Years since earliest job start date | Yes | 41% | 77,947,188 |
| avg_tenure_months | number | Average job tenure in months across all experience entries | Yes | 41% | 77,902,647 |
| highest_degree_level | enum | Highest education level (doctorate/masters/bachelors/associates/high_school) | Yes | 27% | 51,166,659 |
| work_email | string | Current work email address | Yes | 31% | 58,329,768 |
| mobile_phone | string | Current mobile phone number | Yes | 24% | 45,732,581 |
| city | string | Current city | Yes | 85% | 160,551,724 |
| state | string | Current state/province | Yes | 85% | 160,801,798 |
| state_code | string | Current state ISO 3166-2 code (e.g., US-CA, GB-ENG) | Yes | 85% | 160,797,521 |
| country | string | Current country | Yes | 85% | 160,906,286 |
| country_code | string | Current country ISO 3166-1 alpha-2 code (e.g., US, GB) | Yes | 85% | 160,906,286 |
| linkedin_url | string | Primary LinkedIn profile URL | Yes | 70% | 132,952,026 |
| linkedin_id | string | LinkedIn numeric member ID | Yes | 66% | 125,102,234 |
| linkedin_followers | integer | LinkedIn follower count | Yes | 61% | 115,859,463 |
| linkedin_connections | integer | LinkedIn connection count | Yes | 61% | 115,860,240 |
| headline | object | Object containing headline text and raw snippets | No | 63% | 119,027,950 |
| headline.cleaned | string | Headline text | Yes | 63% | 119,027,950 |
| headline.raw[] | array | Headline raw snippets | No | 63% | 119,027,950 |
| summary | object | Object containing summary text and raw snippets | No | 20% | 37,414,046 |
| summary.cleaned | string | Summary text | Yes | 20% | 37,414,046 |
| summary.raw[] | array | Summary raw snippets | No | 20% | 37,414,046 |
| skills[] | array | Array of skills objects | No | 26% | 48,803,261 |
| skills[].raw[] | array | Skills raw snippets | No | 100% | 1,024,915,593 |
| skills[].cleaned | string | Skills text | Yes | 100% | 1,024,915,593 |
| languages[] | array | Array of languages objects | No | 11% | 21,452,641 |
| languages[].raw[] | array | Languages raw snippets | No | 100% | 34,150,452 |
| languages[].cleaned | string | Languages text | Yes | 100% | 34,150,452 |
| languages[].proficiency | enum | Languages proficiency | Yes | 36% | 12,457,619 |
| phones[] | array | Contact has ≥1 phone | No | 37% | 70,761,725 |
| phones[].type | enum | Phone type | No | 100% | 89,365,475 |
| phones[].number | string | Phone number present | No | 100% | 89,365,475 |
| phones[].current | boolean | Phone flagged current | No | 100% | 89,365,475 |
| phones[].confidence | enum | Phone confidence bucket | No | 100% | 89,365,475 |
| phones[].num_sources | integer | Number of sources that contributed to this phone | No | 100% | 89,365,475 |
| phones[].last_seen | string | Phone last-seen date | Yes | 75% | 66,935,458 |
| emails[] | array | Contact has ≥1 email | No | 80% | 150,123,000 |
| emails[].address | string | Email address | No | 100% | 265,843,308 |
| emails[].type | enum | Email type | Yes | 100% | 265,843,308 |
| emails[].current | boolean/null | Current status (null for personal emails) | Yes | 100% | 265,843,308 |
| emails[].validated | boolean | Email validated | No | 100% | 265,843,308 |
| emails[].validation_status | enum | Validation status | Yes | 11% | 28,660,682 |
| emails[].confidence | enum | Email confidence bucket | No | 100% | 265,843,308 |
| emails[].num_sources | integer | Number of sources that contributed to this email | No | 100% | 265,843,308 |
| emails[].last_seen | string | Email last-seen date | Yes | 92% | 245,656,371 |
| emails[].hash_sha256 | string | SHA-256 hash of normalized email address | Yes | 100% | 265,843,308 |
| emails[].hash_sha1 | string | SHA-1 hash of normalized email address | Yes | 100% | 265,843,308 |
| emails[].hash_md5 | string | MD5 hash of normalized email address | Yes | 100% | 265,843,308 |
| locations[] | array | Contact has ≥1 normalized locations | No | 100% | 188,552,896 |
| locations[].street_address | string | Street line present | Yes | 73% | 263,455,803 |
| locations[].address_line_2 | string | Address line 2 present | Yes | 9% | 33,749,047 |
| locations[].city | string | City present | Yes | 100% | 357,981,213 |
| locations[].state | string | State/province name (lowercase) | Yes | 96% | 346,072,352 |
| locations[].state_code | string | ISO 3166-2 code (e.g., US-NY) | Yes | 96% | 345,514,959 |
| locations[].country | string | Country name (lowercase) | Yes | 99% | 355,981,512 |
| locations[].country_code | string | ISO 3166-1 alpha-2 (e.g., US) | Yes | 99% | 355,975,338 |
| locations[].postal_code | string | Postal code present | Yes | 75% | 269,234,594 |
| locations[].postal_code_4 | string | ZIP+4 suffix (4-digit, US only) | Yes | 32% | 113,598,851 |
| locations[].continent | string | Continent name | Yes | 99% | 355,975,280 |
| locations[].continent_code | string | Continent code | Yes | 99% | 355,975,280 |
| locations[].geo | string | Latitude,longitude coordinates | Yes | 62% | 223,533,184 |
| locations[].raw[] | array | Raw location from the source | No | 100% | 359,547,049 |
| locations[].current | boolean | Location flagged current | No | 100% | 359,547,086 |
| locations[].confidence | enum | Location confidence bucket | Yes | 100% | 359,547,086 |
| locations[].num_sources | integer | Number of sources that contributed to this location | No | 100% | 359,547,086 |
| locations[].last_seen | string | Location last-seen date | Yes | 60% | 217,430,243 |
| experience[] | array | Experience entries | No | 70% | 131,612,539 |
| experience[].title | object | Experience title object | Yes | 100% | 421,826,645 |
| experience[].title.cleaned | string | Experience title cleaned value | Yes | 100% | 421,620,248 |
| experience[].title.raw[] | array | Experience title raw snippets | Yes | 100% | 421,620,283 |
| experience[].organization | object | Experience organization object | Yes | 100% | 421,826,645 |
| experience[].organization.legion_id | string | Experience organization Data Legion company ID | Yes | 75% | 316,233,808 |
| experience[].organization.name | object | Experience organization name object | Yes | 100% | 421,826,645 |
| experience[].organization.name.cleaned | string | Experience organization name cleaned value | Yes | 99% | 417,098,892 |
| experience[].organization.name.raw[] | array | Experience organization name raw snippets | Yes | 99% | 417,201,404 |
| experience[].organization.website | string | Experience organization website URL | Yes | 72% | 304,056,180 |
| experience[].organization.linkedin_url | string | Experience organization LinkedIn URL | Yes | 76% | 319,947,612 |
| experience[].organization.linkedin_id | string | Experience organization LinkedIn numeric ID | Yes | 75% | 316,233,808 |
| experience[].organization.industry | enum | Experience organization industry (see enums/industry.csv) | Yes | 72% | 305,220,291 |
| experience[].organization.size | enum | Experience organization size (see enums/company_size.csv) | Yes | 75% | 316,893,361 |
| experience[].start_date | string | Experience start date | Yes | 85% | 358,396,863 |
| experience[].end_date | string | Experience end date | Yes | 63% | 265,988,169 |
| experience[].current | boolean | Experience current flag | Yes | 85% | 358,514,868 |
| experience[].tenure_months | integer | Job tenure in months (calculated from start_date and end_date) | Yes | 85% | 358,374,550 |
| experience[].seniority_level | enum | Seniority level classification (see enums/seniority_level.csv) | Yes | 44% | 186,877,993 |
| experience[].job_function | enum | Job function classification (see enums/job_function.csv) | Yes | 100% | 421,384,453 |
| experience[].expense_category | enum | P&L expense category (see enums/expense_category.csv) | Yes | 100% | 421,384,453 |
| experience[].is_decision_maker | boolean | True if seniority is c_level/owner/partner/vp/director | Yes | 100% | 421,826,645 |
| experience[].is_platform_worker | boolean | True if role is platform/gig usage (e.g. Airbnb host, Uber driver), not employment | No | 100% | 421,826,645 |
| experience[].description | object | Experience description object | Yes | 100% | 421,826,645 |
| experience[].description.cleaned | string | Experience description text | Yes | 44% | 187,327,596 |
| experience[].description.raw[] | array | Experience description raw snippets | Yes | 44% | 187,370,111 |
| education[] | array | Education entries | No | 39% | 73,688,196 |
| education[].organization | object | Education organization object | Yes | 100% | 154,667,392 |
| education[].organization.name | object | Education organization name object | Yes | 100% | 154,667,392 |
| education[].organization.name.cleaned | string | Education organization name cleaned value | Yes | 100% | 154,667,384 |
| education[].organization.name.raw[] | array | Education organization name raw snippets | Yes | 100% | 154,667,392 |
| education[].organization.linkedin_url | string | Education organization LinkedIn URL | Yes | 79% | 122,394,385 |
| education[].organization.website | string | Education organization website URL | Yes | 48% | 74,739,355 |
| education[].degree | object | Education degree object | Yes | 100% | 154,667,392 |
| education[].degree.cleaned | string | Education degree cleaned value | Yes | 68% | 104,746,339 |
| education[].degree.raw[] | array | Education degree raw snippets | Yes | 68% | 104,746,394 |
| education[].degree_level | enum | Degree level (see enums/degree_level.csv) | Yes | 55% | 85,086,335 |
| education[].field_of_study | object | Field of study object | Yes | 100% | 154,667,392 |
| education[].field_of_study.cleaned | string | Field of study cleaned value | Yes | 60% | 92,733,786 |
| education[].field_of_study.raw[] | array | Field of study raw snippets | Yes | 60% | 92,733,817 |
| education[].start_date | string | Education start date | Yes | 74% | 114,988,571 |
| education[].end_date | string | Education end date | Yes | 76% | 117,811,058 |
| education[].current | boolean | Whether currently enrolled (True=ongoing, False=completed, null=unknown) | Yes | 100% | 154,667,392 |
| socials[] | array | Social links | No | 73% | 137,716,314 |
| socials[].network | enum | Social network | No | 100% | 167,925,082 |
| socials[].url | string | Social URL | No | 100% | 167,925,082 |
| socials[].id | string | Social ID | Yes | 88% | 147,599,659 |
| socials[].username | string | Social username | Yes | 100% | 167,925,079 |
| socials[].current | boolean | Social is current | No | 100% | 167,925,082 |
| socials[].confidence | enum | Social confidence bucket | No | 100% | 167,925,082 |
| socials[].num_sources | integer | Number of sources that contributed to this social | No | 100% | 167,925,082 |
| socials[].last_seen | string | Social last-seen date | Yes | 69% | 116,435,256 |
| num_sources | integer | Number of data sources for this contact | No | 100% | 188,800,817 |
| current_jobs_last_updated | string | Date when current job data last changed | Yes | 37% | 69,479,634 |
| current_jobs_last_confirmed | string | Date when current job data was last confirmed | Yes | 37% | 69,479,634 |
| current_location_last_updated | string | Date when current location data last changed | Yes | 75% | 142,050,263 |
| current_location_last_confirmed | string | Date when current location data was last confirmed | Yes | 75% | 141,864,641 |
| last_seen | string | Date when this record was last seen | Yes | 96% | 181,018,719 |
| build_version | string | Build version identifier | No | 100% | 188,800,817 |
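
The Fill Rate column appears to be Count divided by a denominator that depends on nesting: the total contact count for top-level fields, and the total number of array elements for fields inside an array (for example, the 89,365,475 phone entries for `phones[].*`). That reading is inferred from the figures on this page rather than stated by it, so treat the sketch below as an assumption. The `fill_rate` helper is hypothetical.

```python
# Denominators taken from this page; the interpretation of which
# denominator applies to which field is an inference, not documented.
TOTAL_CONTACTS = 188_800_817
TOTAL_PHONE_ENTRIES = 89_365_475


def fill_rate(count, denominator):
    """Return the fill rate as a whole-number percentage."""
    return round(100 * count / denominator)


if __name__ == "__main__":
    # Top-level field: count of contacts with a birth_date.
    print("birth_date:", fill_rate(62_928_706, TOTAL_CONTACTS), "%")
    # Nested field: count of phone entries with a last_seen date.
    print("phones[].last_seen:", fill_rate(66_935_458, TOTAL_PHONE_ENTRIES), "%")
```

Under this assumption, a nested fill rate such as 75% for `phones[].last_seen` describes phone entries, not contacts, so it cannot be compared directly with top-level fill rates. Rounding to whole percentages also means small non-zero counts (for example `prefix`, with 113,168 records) display as 0%.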