Endogenous sample selection in Zambian household surveys
This study emphasises the crucial role of high-quality data in informing policy decisions. The project aims to identify weaknesses in current data collection processes, evaluating current methodologies and highlighting areas for improvement. It looks specifically for indications of endogenous sample selection and builds a reliable database for comparing population parameters across surveys.
Policies are more effective when they are built on high-quality data, so policymakers need to understand how reliable and accurate the data underlying their decisions are. Through a quality assessment of various household surveys, we aim to evaluate the existing archive of data, identify weaknesses in data production processes, and determine whether adjustments to data collection processes are warranted.
We aim to conduct a thorough evaluation of household survey data held within the Zambian Government, with results shared widely among governmental stakeholders. The Zambia Statistics Agency (ZamStats) will also gain a comprehensive understanding of current data quality, enabling it to pinpoint areas for improvement and evaluate its existing data collection methodologies.
Initially, our main objective is to gather microdata and documentation from nationally representative surveys. We will identify population parameters that can be estimated consistently across all surveys. The data will then be cleaned and standardised so that these parameters can be compared reliably, allowing us to produce a thorough report that highlights discrepancies between surveys and serves as a resource for future analysis.
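As a minimal sketch of what such a comparison could look like, the Python below computes a weighted estimate of one population parameter per survey and reports the largest gap between surveys. The survey labels, the choice of household size as the parameter, the record layout and the toy values are all assumptions made for illustration; they are not drawn from the project's data or methods.

```python
from dataclasses import dataclass


@dataclass
class Household:
    survey: str    # hypothetical survey label
    size: float    # parameter of interest (assumed here: household size)
    weight: float  # sampling weight supplied with the microdata


def weighted_mean(values, weights):
    """Weighted mean of values using sampling weights."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def estimates_by_survey(households):
    """Return {survey label: weighted mean household size}."""
    grouped = {}
    for h in households:
        grouped.setdefault(h.survey, []).append(h)
    return {
        survey: weighted_mean([h.size for h in rows], [h.weight for h in rows])
        for survey, rows in grouped.items()
    }


def max_discrepancy(estimates):
    """Largest gap between any two survey estimates."""
    values = list(estimates.values())
    return max(values) - min(values)


# Made-up toy records for illustration only, not real survey data.
toy = [
    Household("survey_a", 4, 100.0),
    Household("survey_a", 6, 300.0),
    Household("survey_b", 5, 200.0),
    Household("survey_b", 5, 200.0),
]
estimates = estimates_by_survey(toy)
gap = max_discrepancy(estimates)
```

In practice a gap like this would need to be judged against sampling error and differences in survey timing and definitions before being read as a data quality problem.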
The second stage will centre on identifying indications of endogenous sample selection by assessing how the representation of households and individuals in the data varies with the level of enumerator effort required to reach them. The assessment may prompt investigations into enhancing the monitoring of data quality. In subsequent stages, the project may incorporate algorithms to detect data quality issues, create and evaluate alternative survey designs to alleviate these challenges, and develop tools for post-analysis corrections.
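One simple way to look for such a pattern, sketched here under stated assumptions, is to recompute an estimate while progressively including households that needed more visits to interview. If the estimate drifts as harder-to-reach households are added, households that were never reached may differ too. The record layout (visits needed, outcome, weight), the outcome variable and the toy values are hypothetical, and the sketch assumes positive weights; it is not the project's stated method.

```python
def cumulative_estimates(records):
    """Weighted mean of the outcome among households reached within k visits.

    records: iterable of (visits_needed, outcome, weight) tuples, where
    visits_needed is a hypothetical measure of enumerator effort.
    Returns a list of (k, estimate) pairs, one per observed effort level.
    """
    ordered = sorted(records, key=lambda r: r[0])
    results = []
    numerator = 0.0
    denominator = 0.0
    i = 0
    for k in sorted({r[0] for r in ordered}):
        # Add every household that was reached in at most k visits.
        while i < len(ordered) and ordered[i][0] <= k:
            _, outcome, weight = ordered[i]
            numerator += outcome * weight
            denominator += weight
            i += 1
        results.append((k, numerator / denominator))
    return results


# Made-up toy records for illustration only, not real survey data.
# The outcome is a 0/1 indicator, e.g. "someone is usually home in the daytime".
toy = [
    (1, 1, 1.0),
    (1, 1, 1.0),
    (2, 0, 1.0),
    (3, 0, 1.0),
]
by_effort = cumulative_estimates(toy)
# Change in the estimate between the lowest and highest effort levels.
drift = by_effort[-1][1] - by_effort[0][1]
```

A non-zero drift is only an indication; it would need to be tested against sampling variability before concluding that selection is endogenous.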