Variational quantum algorithms are a leading candidate for early applications on noisy intermediate-scale quantum computers. These algorithms depend on a classical optimization outer-loop that minimizes some function of a parameterized quantum circuit. In practice, finite sampling error and gate errors make this a stochastic optimization with unique challenges that must be addressed at the level of the optimizer. The sharp trade-off between precision and sampling time in conjunction with experimental constraints necessitates the development of new optimization strategies to minimize overall wall clock time in this setting. We introduce an optimization method and numerically compare its performance with common methods in use today. The method is a simple surrogate model-based algorithm designed to improve reuse of collected data. It does so by estimating the gradient using a least-squares quadratic fit of sampled function values within a moving trusted region. To make fair comparisons between optimization methods, we develop experimentally relevant cost models designed to balance efficiency in testing and accuracy with respect to cloud quantum computing systems. The results here underscore the need to both use relevant cost models and optimize hyperparameters of existing optimization methods for competitive performance. We compare tuned methods using cost models presented by superconducting devices accessed through cloud computing platforms. The method introduced here has several practical advantages in realistic experimental settings, and has been used successfully in a separately published experiment on Google's Sycamore device.