Statistics on field coverage, confidence distribution, and dataset composition.

Dataset Snapshot

  • Total contacts: 188,844,186

Confidence Distribution (linkage confidence)

| Attribute | high | moderate | low | Total |
|---|---|---|---|---|
| Phones | 59,319,173 | 14,324,978 | 15,716,351 | 89,360,502 |
| Emails | 149,662,907 | 50,429,297 | 65,771,179 | 265,863,383 |
| Locations | 115,253,785 | 103,550,609 | 140,961,390 | 359,765,784 |
| Socials | 134,724,554 | 17,499,922 | 15,755,936 | 167,980,412 |
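
The raw counts above are easier to compare as shares of each attribute's total. The following is a minimal sketch that copies the counts from the confidence table and computes the fraction of records in each bucket. The names `CONFIDENCE_COUNTS` and `confidence_shares` are illustrative and not part of the dataset or any official tooling.

```python
# Counts copied from the Confidence Distribution table.
# The dictionary layout and function name are hypothetical, for illustration only.
CONFIDENCE_COUNTS = {
    "phones": {"high": 59_319_173, "moderate": 14_324_978, "low": 15_716_351},
    "emails": {"high": 149_662_907, "moderate": 50_429_297, "low": 65_771_179},
    "locations": {"high": 115_253_785, "moderate": 103_550_609, "low": 140_961_390},
    "socials": {"high": 134_724_554, "moderate": 17_499_922, "low": 15_755_936},
}


def confidence_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each bucket's fraction of the attribute's total record count."""
    total = sum(counts.values())
    return {bucket: n / total for bucket, n in counts.items()}


for attribute, counts in CONFIDENCE_COUNTS.items():
    shares = confidence_shares(counts)
    formatted = ", ".join(f"{bucket}: {share:.1%}" for bucket, share in shares.items())
    print(f"{attribute:<10} {formatted}")
```

Summing the three buckets for each attribute reproduces the Total column, which is a quick consistency check when the table is refreshed.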

Coverage Summary

| Field | Type | Description | Nullable | Fill Rate | Count |
|---|---|---|---|---|---|
| legion_id | string | Stable contact identifier | No | 100% | 188,844,186 |
| full_name | string | Best-available full name | No | 100% | 188,844,186 |
| first_name | string | First name | No | 100% | 188,844,186 |
| middle_name | string | Middle name (non-initial) | Yes | 5% | 8,939,249 |
| middle_initial | string | Middle initial (provided or derived) | Yes | 15% | 28,965,191 |
| last_name | string | Last name | No | 100% | 188,844,186 |
| last_initial | string | Last name initial | No | 100% | 188,844,186 |
| suffix | string | Name suffix | Yes | 1% | 2,261,508 |
| prefix | string | Name prefix | Yes | 0% | 113,262 |
| sex | enum | Sex value | Yes | 90% | 169,955,797 |
| birth_date | string | Birth date (precision preserved) | Yes | 33% | 62,787,442 |
| birth_year | integer | Birth year | Yes | 33% | 62,787,442 |
| birth_month | integer | Birth month | Yes | 33% | 62,723,757 |
| birth_day | integer | Birth day | Yes | 1% | 1,907,180 |
| age | integer | Age in years | Yes | 33% | 62,787,442 |
| job_title | string | Current job title | Yes | 66% | 124,727,026 |
| company_legion_id | string | Current company Data Legion ID | Yes | 44% | 83,558,147 |
| company_name | string | Current company name | Yes | 64% | 121,103,373 |
| company_domain | string | Current company website domain | Yes | 43% | 80,605,032 |
| company_linkedin_url | string | Current company LinkedIn URL | Yes | 45% | 84,997,723 |
| company_linkedin_id | string | Current company LinkedIn numeric ID | Yes | 44% | 83,558,147 |
| company_industry | enum | Current company industry (see enums/industry.csv) | Yes | 43% | 80,912,203 |
| company_size | enum | Current company size (see enums/company_size.csv) | Yes | 45% | 84,406,878 |
| seniority_level | enum | Seniority level (c_level/owner/partner/vp/director/manager/senior/junior/training/intern) | Yes | 27% | 51,709,634 |
| job_function | enum | Job function (engineering/sales/marketing/healthcare/education/etc.) | Yes | 66% | 124,638,607 |
| expense_category | enum | P&L expense category (see enums/expense_category.csv) | Yes | 66% | 124,638,607 |
| is_decision_maker | boolean | True if seniority is c_level/owner/partner/vp/director | Yes | 100% | 188,844,186 |
| is_platform_worker | boolean | True if current job is platform/gig usage (e.g. Airbnb host, Uber driver), not employment | Yes | 100% | 188,844,186 |
| years_of_experience | integer | Years since earliest job start date | Yes | 41% | 77,995,635 |
| avg_tenure_months | number | Average job tenure in months across all experience entries | Yes | 41% | 77,951,080 |
| highest_degree_level | enum | Highest education level (doctorate/masters/bachelors/associates/high_school) | Yes | 27% | 51,233,719 |
| work_email | string | Current work email address | Yes | 31% | 58,270,994 |
| mobile_phone | string | Current mobile phone number | Yes | 24% | 45,727,486 |
| city | string | Current city | Yes | 85% | 160,770,041 |
| state | string | Current state/province | Yes | 85% | 161,028,726 |
| state_code | string | Current state ISO 3166-2 code (e.g., US-CA, GB-ENG) | Yes | 85% | 161,025,333 |
| country | string | Current country | Yes | 85% | 161,128,947 |
| country_code | string | Current country ISO 3166-1 alpha-2 code (e.g., US, GB) | Yes | 85% | 161,128,947 |
| linkedin_url | string | Primary LinkedIn profile URL | Yes | 70% | 132,997,528 |
| linkedin_id | string | LinkedIn numeric member ID | Yes | 66% | 125,181,123 |
| linkedin_followers | integer | LinkedIn follower count | Yes | 62% | 116,210,135 |
| linkedin_connections | integer | LinkedIn connection count | Yes | 62% | 116,210,823 |
| headline | object | Object containing headline text and raw snippets | No | 63% | 119,001,638 |
| headline.cleaned | string | Headline text | Yes | 63% | 119,001,638 |
| headline.raw[] | array | Headline raw snippets | No | 63% | 119,001,638 |
| summary | object | Object containing summary text and raw snippets | No | 20% | 37,467,627 |
| summary.cleaned | string | Summary text | Yes | 20% | 37,467,627 |
| summary.raw[] | array | Summary raw snippets | No | 20% | 37,467,627 |
| skills[] | array | Array of skills objects | No | 26% | 48,812,490 |
| skills[].raw[] | array | Skills raw snippets | No | 100% | 1,025,104,629 |
| skills[].cleaned | string | Skills text | Yes | 100% | 1,025,104,629 |
| languages[] | array | Array of languages objects | No | 11% | 21,466,784 |
| languages[].raw[] | array | Languages raw snippets | No | 100% | 34,184,009 |
| languages[].cleaned | string | Languages text | Yes | 100% | 34,184,009 |
| languages[].proficiency | enum | Languages proficiency | Yes | 37% | 12,537,316 |
| certifications[] | array | Array of certification objects | No | 7% | 13,125,848 |
| certifications[].name.cleaned | string | Normalized certification name | Yes | 100% | 33,049,552 |
| certifications[].name.raw[] | array | Raw certification name variants | No | 100% | 33,049,552 |
| certifications[].institution.cleaned | string | Normalized issuing institution | Yes | 95% | 31,259,320 |
| certifications[].institution.raw[] | array | Raw issuing institution variants | No | 95% | 31,259,320 |
| certifications[].credential_id | string | Credential ID | Yes | 14% | 4,778,019 |
| certifications[].issue_date | string | Issue date | Yes | 78% | 25,904,290 |
| certifications[].expiration_date | string | Expiration date | Yes | 11% | 3,755,616 |
| phones[] | array | Contact has ≥1 phone | No | 37% | 70,758,678 |
| phones[].type | enum | Phone type | No | 100% | 89,360,502 |
| phones[].number | string | Phone number present | No | 100% | 89,360,502 |
| phones[].current | boolean | Phone flagged current | No | 100% | 89,360,502 |
| phones[].confidence | enum | Phone confidence bucket | No | 100% | 89,360,502 |
| phones[].num_sources | integer | Number of sources that contributed to this phone | No | 100% | 89,360,502 |
| phones[].last_seen | string | Phone last-seen date | Yes | 75% | 66,930,727 |
| emails[] | array | Contact has ≥1 email | No | 80% | 150,138,059 |
| emails[].address | string | Email address | No | 100% | 265,863,383 |
| emails[].type | enum | Email type | Yes | 100% | 265,863,383 |
| emails[].current | boolean/null | Current status (null for personal emails) | Yes | 100% | 265,863,383 |
| emails[].validated | boolean | Email validated | No | 100% | 265,863,383 |
| emails[].validation_status | enum | Validation status | Yes | 11% | 28,662,267 |
| emails[].confidence | enum | Email confidence bucket | No | 100% | 265,863,383 |
| emails[].num_sources | integer | Number of sources that contributed to this email | No | 100% | 265,863,383 |
| emails[].last_seen | string | Email last-seen date | Yes | 92% | 245,673,994 |
| emails[].hash_sha256 | string | SHA-256 hash of normalized email address | Yes | 100% | 265,863,383 |
| emails[].hash_sha1 | string | SHA-1 hash of normalized email address | Yes | 100% | 265,863,383 |
| emails[].hash_md5 | string | MD5 hash of normalized email address | Yes | 100% | 265,863,383 |
| locations[] | array | Contact has ≥1 normalized locations | No | 100% | 188,596,393 |
| locations[].street_address | string | Street line present | Yes | 73% | 263,454,949 |
| locations[].address_line_2 | string | Address line 2 present | Yes | 9% | 33,748,516 |
| locations[].city | string | City present | Yes | 100% | 358,197,563 |
| locations[].state | string | State/province name (lowercase) | Yes | 96% | 346,132,360 |
| locations[].state_code | string | ISO 3166-2 code (e.g., US-NY) | Yes | 96% | 345,574,647 |
| locations[].country | string | Country name (lowercase) | Yes | 99% | 356,202,933 |
| locations[].country_code | string | ISO 3166-1 alpha-2 (e.g., US) | Yes | 99% | 356,196,571 |
| locations[].postal_code | string | Postal code present | Yes | 75% | 269,259,439 |
| locations[].postal_code_4 | string | ZIP+4 suffix (4-digit, US only) | Yes | 32% | 113,597,090 |
| locations[].continent | string | Continent name | Yes | 99% | 356,196,513 |
| locations[].continent_code | string | Continent code | Yes | 99% | 356,196,513 |
| locations[].geo | string | Latitude,longitude coordinates | Yes | 62% | 223,558,594 |
| locations[].raw[] | array | Raw location from the source | No | 100% | 359,765,747 |
| locations[].current | boolean | Location flagged current | No | 100% | 359,765,784 |
| locations[].confidence | enum | Location confidence bucket | Yes | 100% | 359,765,784 |
| locations[].num_sources | integer | Number of sources that contributed to this location | No | 100% | 359,765,784 |
| locations[].last_seen | string | Location last-seen date | Yes | 60% | 217,639,270 |
| experience[] | array | Experience entries | No | 70% | 131,660,054 |
| experience[].title | object | Experience title object | Yes | 100% | 422,582,783 |
| experience[].title.cleaned | string | Experience title cleaned value | Yes | 100% | 422,375,605 |
| experience[].title.raw[] | array | Experience title raw snippets | Yes | 100% | 422,375,640 |
| experience[].organization | object | Experience organization object | Yes | 100% | 422,582,783 |
| experience[].organization.legion_id | string | Experience organization Data Legion company ID | Yes | 75% | 316,462,326 |
| experience[].organization.name | object | Experience organization name object | Yes | 100% | 422,582,783 |
| experience[].organization.name.cleaned | string | Experience organization name cleaned value | Yes | 99% | 417,836,504 |
| experience[].organization.name.raw[] | array | Experience organization name raw snippets | Yes | 99% | 417,938,161 |
| experience[].organization.website | string | Experience organization website URL | Yes | 72% | 304,527,249 |
| experience[].organization.linkedin_url | string | Experience organization LinkedIn URL | Yes | 76% | 320,533,450 |
| experience[].organization.linkedin_id | string | Experience organization LinkedIn numeric ID | Yes | 75% | 316,462,326 |
| experience[].organization.industry | enum | Experience organization industry (see enums/industry.csv) | Yes | 72% | 305,453,343 |
| experience[].organization.size | enum | Experience organization size (see enums/company_size.csv) | Yes | 75% | 317,124,885 |
| experience[].start_date | string | Experience start date | Yes | 85% | 359,128,404 |
| experience[].end_date | string | Experience end date | Yes | 63% | 266,621,376 |
| experience[].current | boolean | Experience current flag | Yes | 85% | 359,272,818 |
| experience[].tenure_months | integer | Job tenure in months (calculated from start_date and end_date) | Yes | 85% | 359,105,989 |
| experience[].seniority_level | enum | Seniority level classification (see enums/seniority_level.csv) | Yes | 44% | 187,244,808 |
| experience[].job_function | enum | Job function classification (see enums/job_function.csv) | Yes | 100% | 422,139,509 |
| experience[].expense_category | enum | P&L expense category (see enums/expense_category.csv) | Yes | 100% | 422,139,509 |
| experience[].is_decision_maker | boolean | True if seniority is c_level/owner/partner/vp/director | Yes | 100% | 422,582,783 |
| experience[].is_platform_worker | boolean | True if role is platform/gig usage (e.g. Airbnb host, Uber driver), not employment | No | 100% | 422,582,783 |
| experience[].description | object | Experience description object | Yes | 100% | 422,582,783 |
| experience[].description.cleaned | string | Experience description text | Yes | 44% | 187,734,474 |
| experience[].description.raw[] | array | Experience description raw snippets | Yes | 44% | 187,777,558 |
| education[] | array | Education entries | No | 39% | 73,749,914 |
| education[].organization | object | Education organization object | Yes | 100% | 155,136,028 |
| education[].organization.name | object | Education organization name object | Yes | 100% | 155,136,028 |
| education[].organization.name.cleaned | string | Education organization name cleaned value | Yes | 100% | 155,136,020 |
| education[].organization.name.raw[] | array | Education organization name raw snippets | Yes | 100% | 155,136,028 |
| education[].organization.linkedin_url | string | Education organization LinkedIn URL | Yes | 79% | 123,034,846 |
| education[].organization.website | string | Education organization website URL | Yes | 50% | 77,187,419 |
| education[].degree | object | Education degree object | Yes | 100% | 155,136,028 |
| education[].degree.cleaned | string | Education degree cleaned value | Yes | 68% | 105,304,427 |
| education[].degree.raw[] | array | Education degree raw snippets | Yes | 68% | 105,304,483 |
| education[].degree_level | enum | Degree level (see enums/degree_level.csv) | Yes | 55% | 85,421,225 |
| education[].field_of_study | object | Field of study object | Yes | 100% | 155,136,028 |
| education[].field_of_study.cleaned | string | Field of study cleaned value | Yes | 60% | 93,016,565 |
| education[].field_of_study.raw[] | array | Field of study raw snippets | Yes | 60% | 93,016,596 |
| education[].start_date | string | Education start date | Yes | 74% | 115,557,171 |
| education[].end_date | string | Education end date | Yes | 76% | 118,434,042 |
| education[].current | boolean | Whether currently enrolled (True=ongoing, False=completed, null=unknown) | Yes | 100% | 155,136,028 |
| socials[] | array | Social links | No | 73% | 137,757,685 |
| socials[].network | enum | Social network | No | 100% | 167,980,412 |
| socials[].url | string | Social URL | No | 100% | 167,980,412 |
| socials[].id | string | Social ID | Yes | 88% | 147,691,787 |
| socials[].username | string | Social username | Yes | 100% | 167,980,409 |
| socials[].current | boolean | Social is current | No | 100% | 167,980,412 |
| socials[].confidence | enum | Social confidence bucket | No | 100% | 167,980,412 |
| socials[].num_sources | integer | Number of sources that contributed to this social | No | 100% | 167,980,412 |
| socials[].last_seen | string | Social last-seen date | Yes | 70% | 116,793,546 |
| num_sources | integer | Number of data sources for this contact | No | 100% | 188,844,186 |
| current_jobs_last_updated | string | Date when current job data last changed | Yes | 37% | 69,514,819 |
| current_jobs_last_confirmed | string | Date when current job data was last confirmed | Yes | 37% | 69,514,819 |
| current_location_last_updated | string | Date when current location data last changed | Yes | 75% | 142,304,685 |
| current_location_last_confirmed | string | Date when current location data was last confirmed | Yes | 75% | 142,118,858 |
| last_seen | string | Date when this record was last seen | Yes | 96% | 181,061,809 |
| build_version | string | Build version identifier | No | 100% | 188,844,186 |
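
The coverage table does not state how Fill Rate is derived from Count. The published figures are consistent with two different denominators: top-level fields and array containers (for example `work_email` or `phones[]`) appear to be measured against the total number of contacts, while fields nested inside an array (for example `phones[].last_seen`) appear to be measured against the number of elements in that array. The sketch below reproduces three table values under that reading. The helper `fill_rate` and the round-to-nearest-percent rule are assumptions, not documented behavior.

```python
TOTAL_CONTACTS = 188_844_186   # from the Dataset Snapshot
TOTAL_PHONE_ENTRIES = 89_360_502  # Total column for Phones in the confidence table


def fill_rate(count: int, denominator: int) -> int:
    """Fill rate as a whole-number percentage (rounding rule is an assumption)."""
    return round(100 * count / denominator)


# Top-level field: share of all contacts with a work email.
work_email_rate = fill_rate(58_270_994, TOTAL_CONTACTS)

# Array container: share of all contacts with at least one phone.
has_phone_rate = fill_rate(70_758_678, TOTAL_CONTACTS)

# Nested field: share of phone entries (not contacts) that carry a last-seen date.
phone_last_seen_rate = fill_rate(66_930_727, TOTAL_PHONE_ENTRIES)

print(work_email_rate, has_phone_rate, phone_last_seen_rate)
```

Under this reading, a 100% fill rate on a nested field such as `phones[].number` means every phone entry has a number, not that every contact has a phone.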