Ways of Reducing Coverage and Sampling Error As Part of the Total Survey Error Framework for Establishment Surveys in Europe: Recent Developments

Nikola Jovanovski - Sample Solutions BV
Carsten Broich - Sample Solutions BV

www.companyname.com © 2016 Startup theme. All Rights Reserved.

CONTENT OVERVIEW

1 Introduction
● Company Background
● European Establishment Surveys
● Big Data as a crucial part in sampling

2 Research
● Sample challenges and Research Problems
● ESRA 2019 Future considerations

3 Establishment Sample sources
● What are the different sources of establishment sample data for the EU27

4 Enrichment Methodology
● How can we use a Big Data approach to decrease sampling error in European Establishment Surveys

5 Pros and Cons
● Benefits and limitations from using the enrichment approach

6 Conclusion
● Research Problem address
● Q&A

1. INTRODUCTION

About Sample Solutions

● Background: Founded in the Netherlands in 2009, with a focus on business and consumer telephone sample
● Sample Survey Platform, B2B module: specialized B2B database designed for survey research, with instant counts and sample ordering
● Multi-country B2B sample projects: Eurobarometer, London Economics, PwC, ABB, World Bank, American Express

WHAT IS BIG DATA?

"Big data is a field that treats ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with by traditional data-processing application software."

European Establishment Surveys:

● Flash Eurobarometer (business survey part)
● European Companies Survey (ECS)
● European Survey of Enterprises on New and Emerging Risks (ESENER)
Sampling frame sources per country:

● B2B data vendors (finance, marketing, sales purposes)
● National Business Register
● Chamber of Commerce Directory

2. RESEARCH

Sampling Challenge

~60% telephone coverage
[Figure labels: Entire frame; No phone number included; Wrong/outdated numbers]
Source: Business Count that includes entire frame with and without phone number records

Issues among the sampling sources:
➔ Decentralized sample data vending (1 vendor per country)
◆ Under-representation among certain:
● Industries
● Company sizes
● Area typologies
● Subsidiaries (branches)
◆ Variable sample costs among countries
◆ Limited coverage of contact details of decision makers and officers
◆ No or limited coverage of email addresses

Sample comparison: Phone v Email

Establishment sample
➔ Offline - phone
◆ Key breakdown values (Industry code, Size)
◆ No direct contact point with decision maker
◆ Overall better response rates
➔ Online - email
◆ Key breakdown values (Industry code, Size)
◆ Direct email of decision maker
◆ Targeting specific job roles based on research

Phone
● Response rates
● Coverage among sole traders
● Wide availability
● Rapid interview turnaround time

Email
● Direct approach
● Desired role
● Convenient time of interview
● Low cost
● Non opt-in sampling

Research questions

What are the ways of reducing coverage and sampling error for Establishment Surveys in the EU27?
1. How can Big Data be used to increase the telephone coverage in the sampling frame?
2. How can the sampling frame go online, for a mixed-mode sampling approach?
3. How can we ensure sample data accuracy and validity in an international and longitudinal study?
Future considerations:

At ESRA 2019 we proposed the following future action steps:
➔ Use Big Data to identify company size (number of employees) for businesses with unknown size
➔ Include a validation step in the enrichment process to lower data inaccuracy
➔ Include an area typology stratification criterion in order to make the sample more representative of the frame
➔ Design the sample of all countries from a centralized source
➔ Source contact person data and email

3. ESTABLISHMENT SAMPLE SOURCES

ESTABLISHMENT SAMPLE ELEMENTS:

01 LOCATION
● Country, City, Postcode, Region, HQ Location, Branches Location
02 INDUSTRY
● SIC code, Keywords (from LinkedIn), NAICS code, NACE code
03 COMPANY SIZE
● Employee Range
● Revenue Range
04 COMPANY STATUS
● Founded date
● Operating status, Entity Type
05 CONTACT OPTIONS
● Email available, Phone available, Website URL, Decision maker (C-level)
06 TECHNOLOGY
● Technologies divided into categories
07 CONTACTS INFO
● Departments, Job title
● C-level, VP level, Director, Manager, Non-Manager

Three sources for establishment sample:

TRADITIONAL
● Available lists
● Basic business information
ENRICHMENT
● Online lookups
● Big Data
GENERATED
● Email patterns
● Verification of email activity

Traditional sources

National Registers
● State or privately run
● Categorized with NACE and SME sizes
Phone book directories
● Open and free access for general public
Commercial Data vendors
● Paid access
● For marketing, credit report and sales purposes
● Rich in detail
Chamber of Commerce
● Rich in business information
● Can be member only

CHARACTERISTICS OF TRADITIONAL SOURCES:

[Table: columns Mandatory, Free access, Email addresses, Contacts; the column position of each mark is not recoverable for rows with fewer than four marks]
● National Business Register: * * *
● Phone book: *
● Commercial Data Vendor: *
● Chamber of Commerce: * * * *
*Applies to majority of cases in the EU27

BIG DATA - ENRICHMENT SOURCES

01 Search Engines
02 Directory lookup
03 Social Media
04 Review sites

Enrichment sources and tools:

Search Engine
● Google Snippet
● Company name and address search
● Email address search
● Contact person lookup
Directory lookups
● Reverse phone lookups
● Company lookups (Company Name + Address)
● Geographic lookup
Social Media
● Facebook
● LinkedIn
● Contact and employee size lookup
Review Sites
● Company lookup
● Phone lookup
● Geographic lookup

SAMPLE DATA ENRICHMENT

Technology Enrichment
● From a variety of 300 web technologies incorporated on the website
Contact personnel
● Decision makers
● Officers
● Employees
Sizes
● From employee count and reports of branches
Telephone number
● For missing records
● Incorrect/outdated
Misc details
● VAT number, Business Description, Entity form
● Listed email addresses

Which source can produce which data

[Table: features (Employee size, Phone number, Contact person, Email address) by source (LinkedIn, Google, Facebook, Directory lookup, Yelp & Tripadvisor); cell marks not recoverable]

Overall Enrichment rates among elements

[Chart: values 75%, 60%, 40%, 20%; categories Phone number, Employee size, Contacts, Email address]

Challenges in Big Data enrichment

● Native encoding: recognition and acceptance of letters such as Greek, Cyrillic, Nordic etc.
● Incorrect location information: wrong or no address reduces enrichment productivity
● Registration name: Legal Name ≠ Merchant Name
● Inactive businesses: input frames include potential dead records

GENERATED SAMPLE SOURCE: EMAIL ADDRESS SAMPLING METHODOLOGY

Input Data
● Sourced company URL to create email domain
● Contact person name
Generating Patterns
● Apply input data to the pattern combinations, e.g. fi[email protected], [email protected], fi[email protected]
Validation of email activity
● Screen all combinations using an SMTP service to see whether the email server recognizes the email addresses

Email Categorization

Input: Company Enrichment Email
Categorization:
● Lookup table for general
● Name database for personal
● Freemail provider list
Output categories: Personal, General, Freemail

SAMPLE DATA VERIFICATION

DATA VERIFICATION
➔ EMAILS
◆ Verified
◆ Not verified
◆ No status
➔ URLS
◆ Working
◆ Not working
◆ Redirecting
◆ No status

5. ENRICHMENT METHODOLOGY

Outline

European enterprises and establishments sampling featured in both online and offline sampling modes

Full sampling method defined by:
➔ Mixed Mode Sample (Online and Offline):
◆ Direct and/or Company Email
◆ Company phone
➔ Merging and blending multiple business sample sources for a country into one frame, ensuring maximum coverage among the EU27
➔ Collected email address, contact person data and
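
The pattern generation and categorization steps of the email address sampling methodology described earlier can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the pattern set, the lookup tables, and the function names are hypothetical, since the slides' own example addresses are not legible and the actual lookup tables are not given. The SMTP screening step is not attempted, because it depends on an external service.

```python
from urllib.parse import urlparse

# Hypothetical pattern set; the deck's own pattern examples are not legible.
PATTERNS = ("{first}.{last}", "{f}{last}", "{first}", "{first}{last}")

# Hypothetical stand-ins for the lookup tables named on the categorization slide.
GENERAL_LOCAL_PARTS = {"info", "office", "sales", "contact", "support"}
FREEMAIL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com"}
FIRST_NAMES = {"anna", "john", "maria"}


def email_domain(company_url: str) -> str:
    """Derive an email domain from a company URL (drop scheme, path, 'www.')."""
    url = company_url if "//" in company_url else "//" + company_url
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def candidate_emails(first: str, last: str, company_url: str) -> list[str]:
    """Apply the contact name to every pattern to get candidate addresses."""
    first, last = first.strip().lower(), last.strip().lower()
    fields = {"first": first, "last": last, "f": first[:1]}
    domain = email_domain(company_url)
    return [f"{p.format(**fields)}@{domain}" for p in PATTERNS]


def categorize(email: str) -> str:
    """Classify an address as freemail, general, personal, or unknown."""
    local, _, domain = email.lower().partition("@")
    if domain in FREEMAIL_DOMAINS:          # freemail provider list
        return "freemail"
    if local in GENERAL_LOCAL_PARTS:        # lookup table for general mailboxes
        return "general"
    tokens = local.replace("_", ".").replace("-", ".").split(".")
    if any(t in FIRST_NAMES for t in tokens):  # name database for personal
        return "personal"
    return "unknown"


if __name__ == "__main__":
    for address in candidate_emails("Anna", "Smith", "https://www.example.com/about"):
        print(address, categorize(address))
```

Checking the freemail list before the other tables is a design choice of this sketch, not something the slides specify. Each candidate would still need to pass the validation of email activity before entering the sample.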
