Advances in computer vision have made it possible to train predictive algorithms on both structured and unstructured features of image data sets. With good predictive algorithms and robust image data, we can hope to extrapolate even relatively small samples of income or survey data over large areas. In recent work, we have used Google Street View imagery to develop measures of predicted income, housing prices, and similar metrics at the block level in U.S. cities.
Computer-vision-based urban audit technology is useful in the United States, but it may be even more valuable in developing countries, where census data are often unavailable and researchers frequently spend large amounts of money collecting survey data. We are thus attempting to develop computer vision technologies that enable development researchers to scale their research surveys in locations where Google Street View (or similar) image data is available.
Computer vision tools for survey extrapolation could improve the power of development economics studies by enabling researchers to scale their work beyond the directly surveyed population. They may also reduce the overall costs of surveying developing-world populations, and could in principle be used by governments as complements to existing census efforts.
The current project supports a pilot study; in particular, it will provide funds for the geocodes that will link Street View images to an ongoing J-PAL demographic and health survey in Indonesia.
- J-PAL is conducting a survey on demographics, reported health, and pre-program healthcare utilization in Indonesia, as a component of a healthcare study led by the PI and one of the co-investigators.
- Indonesia has high-quality Google Street View images.
- We will thus examine how well Street View imagery can be used to predict responses to the J-PAL survey. In particular, we will train predictive models on part of the surveyed sample in Bandung, measure out-of-sample prediction quality, and then attempt to produce a map of predicted survey responses in Semarang.
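
The train, validate, and extrapolate workflow described for Bandung and Semarang can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual pipeline: the image features are synthetic random vectors standing in for features extracted from Street View images, the survey response is a single simulated numeric outcome, the 300/100 random split is arbitrary, and ridge regression is a placeholder for whatever predictive model is ultimately used. All function and variable names are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]),
                        Xc.T @ (y - y_mean))
    b = y_mean - x_mean @ w
    return w, b

def predict(X, w, b):
    """Linear prediction from image features."""
    return X @ w + b

def r_squared(y_true, y_pred):
    """Share of outcome variance explained by the predictions."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n_features = 16
true_w = rng.normal(size=n_features)  # simulated relationship (assumption)

# Synthetic stand-ins for image features at geocoded survey locations.
X_bandung = rng.normal(size=(400, n_features))
y_bandung = X_bandung @ true_w + rng.normal(scale=0.5, size=400)

# Semarang: image features only, no survey responses used.
X_semarang = rng.normal(size=(250, n_features))

# Step 1: train on part of the Bandung sample (arbitrary 300/100 split).
idx = rng.permutation(len(y_bandung))
train_idx, test_idx = idx[:300], idx[300:]
w, b = fit_ridge(X_bandung[train_idx], y_bandung[train_idx])

# Step 2: measure out-of-sample prediction quality on the rest of Bandung.
r2_holdout = r_squared(y_bandung[test_idx],
                       predict(X_bandung[test_idx], w, b))

# Step 3: predict responses for every Semarang location; paired with
# coordinates, these values would form the predicted-response map.
semarang_pred = predict(X_semarang, w, b)

print(f"Held-out R^2 in Bandung: {r2_holdout:.3f}")
print(f"Predicted responses for {semarang_pred.shape[0]} Semarang locations")
```

In practice the held-out fit in Bandung says nothing by itself about how well the model transfers to a different city, so the Semarang predictions would need their own check wherever survey responses are available there.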