
Assessing the Accuracy of 51 non-probability online panels and river samples: A study of the Advertising Research Foundation 2013 online panel comparison experiment

Yongwei Yang
Ana Villar
Tzuyun Chin
Jon A. Krosnick
71st Annual Conference of the American Association for Public Opinion Research (2016)


Survey research is increasingly conducted using online panels and river samples. With a large number of data suppliers available, data purchasers need to understand the accuracy of the data being provided and whether probability sampling continues to yield more accurate measurements of populations. This paper evaluates the accuracy of estimates from a probability sample and from non-probability survey samples created using different quota sampling strategies and sample sources (panel versus river samples). Data collection was organized by the Advertising Research Foundation (ARF) in 2013. We compare estimates from 45 U.S. online panels of non-probability samples, 6 river samples, and one random-digit-dial (RDD) telephone sample to high-quality benchmarks: population estimates obtained from large-scale face-to-face surveys of probability samples with extremely high response rates (e.g., ACS, NHIS, and NHANES). The non-probability samples were supplied by 17 major U.S. providers. Online respondents were directed to a third-party website where the same questionnaire was administered. The online samples were created using three quota methods: (A) age and gender within regions; (B) Method A plus race/ethnicity; and (C) Method B plus education. Mean questionnaire completion time was 26 minutes, and the average sample size was 1,118. Comparisons are made using unweighted and weighted data, with weighting strategies of increasing complexity. Accuracy is evaluated using the absolute average error method: for each benchmark item, the percentage of respondents who chose the modal category in the benchmark survey is compared with the corresponding percentage in each sample, and the absolute differences are averaged across items. The study illustrates the need for methodological rigor when evaluating the performance of survey samples.
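
The accuracy measure described in the abstract can be illustrated with a minimal Python sketch. It assumes that each sample and the benchmark are summarized as one percentage per item, namely the share of respondents choosing the benchmark's modal category. The function name, item names, and all numbers are hypothetical and are not taken from the study.

```python
from typing import Mapping


def average_absolute_error(
    sample_pct: Mapping[str, float],
    benchmark_pct: Mapping[str, float],
) -> float:
    """Mean absolute difference, in percentage points, between a sample's
    modal-category percentages and the benchmark's, over shared items."""
    shared_items = sample_pct.keys() & benchmark_pct.keys()
    if not shared_items:
        raise ValueError("sample and benchmark share no items")
    total = sum(abs(sample_pct[item] - benchmark_pct[item]) for item in shared_items)
    return total / len(shared_items)


# Hypothetical benchmark: percentage choosing the modal category per item.
benchmark = {"item_a": 80.0, "item_b": 60.0, "item_c": 65.0}

# Hypothetical sample estimates for the same categories.
sample = {"item_a": 76.0, "item_b": 66.0, "item_c": 63.0}

# Absolute errors are 4, 6, and 2 points, so the average is 4.0.
error = average_absolute_error(sample, benchmark)
print(error)
```

Under these assumptions, the same function could be applied to unweighted and weighted estimates from each sample, so that the resulting errors are directly comparable across samples and weighting strategies.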