David X. Chan

Authored Publications
    Approaches for Secure Sales Lift Measurement
    Jerry Friedman
    Jessica Hwang
    Jim Koehler
    Yunxiao Li
    Google LLC (2021)
    We propose a number of potential approaches to enable Sales Lift measurement in a privacy-safe and secure manner. We discuss these approaches in the context of digital publisher media channels, but in theory these approaches can be extended to most types of media channels that have the ability to link media exposure with outcomes. We discuss Sales Lift measurement in both single-publisher and multi-publisher scenarios and weigh the trade-offs among the different solutions in terms of utility, privacy, security, and computation costs.
    In this paper, we compare a variety of methods for causal inference through simulation, examining their sensitivity to and asymptotic behavior in the presence of correlation between (heterogeneous) treatment effect size and propensity to be treated, as well as their robustness to model mis-specification. We limit our focus to well-established methods relevant to the estimation of sales lift, which initially motivated this paper and serves as an illustrative example throughout. We demonstrate that popular matching methods often fail to adequately debias lift estimates, and that even doubly robust estimators, when naively implemented, fail to deliver statistically valid confidence intervals. The culprit is inadequate standard error estimators, which often yield insufficient confidence interval coverage because they fail to take into account uncertainty at early stages of the causal model. As an alternative, we discuss a more reliable approach: the use of a doubly robust point estimator with a sandwich standard error estimator.
    Evaluating the return on ad spend (ROAS), the causal effect of advertising on sales, is critical to advertisers for understanding the performance of their existing marketing strategy as well as how to improve and optimize it. Media Mix Modeling (MMM) has been used as a convenient analytical tool to address the problem using observational data. However, it is well recognized that MMM suffers from various fundamental challenges: data collection, model specification, and selection bias due to ad targeting, among others (Chan & Perry 2017; Wolfe 2016). In this paper, we study the challenge associated with measuring the impact of search ads in MMM, namely the selection bias due to ad targeting. Using causal diagrams of the search ad environment, we derive a statistically principled method for bias correction based on the back-door criterion (Pearl 2013). We use case studies to show that the method provides promising results by comparison with results from randomized experiments. We also report a more complex case study where the advertiser had spent on more than a dozen media channels but results from a randomized experiment are not available. Both our theory and empirical studies suggest that in some common, practical scenarios, one may be able to obtain an approximately unbiased estimate of search ad ROAS.
    One of the major problems in developing media mix models is that the data generally available to the modeler lacks sufficient quantity and information content to reliably estimate the parameters in a model of even moderate complexity. Pooling data from different brands within the same product category provides more observations and greater variability in media spend patterns. We either directly use the results from a hierarchical Bayesian model built on the category dataset, or pass the information learned from the category model to a brand-specific media mix model via informative priors within a Bayesian framework, depending on the data sharing restrictions across brands. We demonstrate using both simulation and real case studies that our category analysis can improve parameter estimation and reduce uncertainty of model prediction and extrapolation.
    Media mix modeling is a statistical analysis of historical data to measure the return on investment (ROI) on advertising and other marketing activities. Current practice usually utilizes data aggregated at a national level, which often suffers from small sample size and insufficient variation in the media spend. When sub-national data is available, we propose a geo-level Bayesian hierarchical media mix model (GBHMMM), and demonstrate that the method generally provides estimates with tighter credible intervals compared to a model with national level data alone. This reduction in error is due to having more observations and useful variability in media spend, which can protect advertisers from unsound reallocation decisions. Under some weak conditions, the geo-level model can reduce the ad targeting bias. When geo-level data is not available for all the media channels, the geo-level model estimates generally deteriorate as more media variables are imputed using the national level data.
    Bayesian Methods for Media Mix Modeling with Carryover and Shape Effects
    Jim Koehler
    research.google.com, Google Inc., 76 Ninth Avenue, New York, NY 10011 (2017)
    Media mix models are used by advertisers to measure the effectiveness of their advertising and provide insight for making future budget allocation decisions. Advertising usually has lag effects and diminishing returns, which are hard to capture using linear regression. In this paper, we propose a media mix model with flexible functional forms to model the carryover and shape effects of advertising. The model is estimated using a Bayesian approach in order to make use of prior knowledge accumulated in previous or related media mix models. We illustrate how to calculate attribution metrics such as ROAS and mROAS from posterior samples on simulated data sets. Simulation studies show that the model can be estimated very well for large data sets, but prior distributions have a big impact on the posteriors when the sample size is small and may lead to biased estimates. We apply the model to data from a shampoo advertiser, and use the Bayesian Information Criterion (BIC) to choose the appropriate specification of the functional forms for the carryover and shape effects. We further illustrate that the optimal media mix based on the model has a large variance due to the variance of the parameter estimates.
    Challenges and Opportunities in Media Mix Modeling
    Mike Perry
    research.google.com, Google, 76 Ninth Avenue, New York, NY 10011 (2017)
    Advertisers have a need to understand the effectiveness of their media spend in driving sales in order to optimize budget allocations. Media mix models are a common and widely used approach for doing so. The paper outlines the various challenges such models encounter in consistently providing valid answers to the advertiser’s questions on media effectiveness. The paper also discusses opportunities for improvements in media mix models that can produce better inference.
    In an earlier study, we reported that on average 89% of the visits to the advertiser’s site from search ad clicks were incremental. In this research, we examine how the ranking of an advertiser’s organic listings on the search results page affects the incrementality of ad clicks expressed through Incremental Ad Clicks (IAC) and as estimated by Search Ads Pause models. A meta-analysis of 390 Search Ads Pause studies highlights the limited opportunity for clicks from organic search results to substitute for ad clicks when the ads are turned off. On average, 81% of ad impressions and 66% of ad clicks occur in the absence of an associated organic search result. We find that having an associated organic search result in rank one does not necessarily mean a low IAC. On average, 50% of the ad clicks that occur with a top rank organic result are incremental, compared to 100% of the ad clicks being incremental in the absence of an associated organic result.
    Advertisers often wonder whether search ads cannibalize their organic traffic. In other words, if search ads were paused, would clicks on organic results increase, and make up for the loss in paid traffic? Google statisticians recently ran over 400 studies on paused accounts to answer this question. In what we call “Search Ads Pause Studies”, our group of researchers observed organic click volume in the absence of search ads. Then they built a statistical model to predict the click volume for given levels of ad spend using spend and organic impression volume as predictors. These models generated estimates for the incremental clicks attributable to search ads (IAC), or in other words, the percentage of paid clicks that are not made up for by organic clicks when search ads are paused. The results were surprising. On average, the incremental ad clicks percentage across verticals is 89%. This means that a full 89% of the traffic generated by search ads is not replaced by organic clicks when ads are paused. This number was consistently high across verticals.
    Incremental Clicks: The Impact of Search Advertising
    Jim Koehler
    Deepak Kumar
    Journal of Advertising Research, vol. 51, no. 4 (2011), pp. 643-647
    In this research, the authors examined how the number of organic clicks changed when search ads were present and when search ad campaigns were turned off. The authors developed a statistical model to estimate the fraction of total clicks that could be attributed to search advertising. A meta-analysis of several hundred of these studies revealed that more than 89 percent of the ad clicks were incremental, in the sense that those visits to the advertiser's site would not have occurred without the ad campaigns.
    Evaluating Online Ad Campaigns in a Pipeline: Causal Models at Scale
    Rong Ge
    Ori Gershony
    Tim Hesterberg
    Diane Lambert
    Proceedings of ACM SIGKDD 2010, pp. 7-15
    Display ads proliferate on the web, but are they effective? Or are they irrelevant in light of all the other advertising that people see? We describe a way to answer these questions, quickly and accurately, without randomized experiments, surveys, focus groups or expert data analysts. Doubly robust estimation protects against the selection bias that is inherent in observational data, and a nonparametric test that is based on irrelevant outcomes provides further defense. Simulations based on realistic scenarios show that the resulting estimates are more robust to selection bias than traditional alternatives, such as regression modeling or propensity scoring. Moreover, computations are fast enough that all processing, from data retrieval through estimation, testing, validation and report generation, proceeds in an automated pipeline, without anyone needing to see the raw data.