Douglas Aberdeen

Doug worked for several years in the field of Reinforcement Learning before joining Google. Within Google he works on Gmail, including spam detection and, most recently, Priority Inbox.

Authored Publications
    The Learning Behind Gmail Priority Inbox
    Ondrej Pacovsky
    LCCC: NIPS 2010 Workshop on Learning on Cores, Clusters and Clouds
    The War Against Spam: A report from the front line
    Brad Taylor
    Dan Fingal
    NIPS 2007 Workshop on Machine Learning in Adversarial Environments for Computer Security
    Abstract: Fighting spam is a success story of real-world machine learning. Despite the occasional spam that does reach our inboxes, the overwhelming majority of spam (and there is a lot of it) is positively identified. At the same time, the rarity with which users feel the need to check their spam box for false positives demonstrates a high precision of classification. This paper is an overview of Google's approach to fighting email abuse with machine learning, and a discussion of some lessons learned.
    The Factored Policy-Gradient Planner
    Olivier Buffet
    Artificial Intelligence, 173 (2009), pp. 722-747
    Abstract: We present an any-time concurrent probabilistic temporal planner (CPTP) that includes continuous and discrete uncertainties and metric functions. Rather than relying on dynamic programming our approach builds on methods from stochastic local policy search. That is, we optimise a parameterised policy using gradient ascent. The flexibility of this policy-gradient approach, combined with its low memory use, the use of function approximation methods and factorisation of the policy, allow us to tackle complex domains. This Factored Policy Gradient (FPG) planner can optimise steps to goal, the probability of success, or attempt a combination of both. We compare the FPG planner to other planners on CPTP domains, and on simpler but better studied non-concurrent non-temporal probabilistic planning (PP) domains. We present FPG-ipc, the PP version of the planner which has been successful in the probabilistic track of the fifth international planning competition.
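The gradient-ascent idea in this abstract can be illustrated with a minimal REINFORCE-style sketch: a softmax policy optimised on a toy two-action problem. All names and constants here are illustrative stand-ins, not the FPG planner's actual implementation.

```python
import math
import random

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def policy_gradient_ascent(rewards, steps=5000, lr=0.1, seed=0):
    """REINFORCE-style ascent on a softmax policy for a two-armed
    bandit -- a toy stand-in for optimising a parameterised policy.
    A running-average reward baseline keeps the estimate stable."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]          # one preference parameter per action
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(theta)
        a = 0 if rng.random() < probs[0] else 1
        r = rewards[a] + rng.gauss(0.0, 0.1)   # noisy reward sample
        adv = r - baseline
        baseline += 0.05 * (r - baseline)
        # score function of a softmax policy: 1[i == a] - probs[i]
        for i in range(2):
            theta[i] += lr * adv * ((1.0 if i == a else 0.0) - probs[i])
    return softmax(theta)

probs = policy_gradient_ascent([0.2, 1.0])
```

With the second action paying more on average, the learned policy ends up concentrating almost all probability on it, which is the basic mechanism the planner scales up with factorisation and function approximation.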
    Natural Actor-Critic for Road Traffic Optimisation
    Silvia Richter
    Jin Yu
    Advances in Neural Information Processing Systems, The MIT Press, Cambridge, MA (2007)
    FF+FPG: Guiding a Policy-Gradient Planner
    Olivier Buffet
    Proceedings of the Seventeenth International Conference on Automated Planning and Scheduling (ICAPS'07), Providence, USA (2007)
    Concurrent Probabilistic Temporal Planning with Policy-Gradients
    Olivier Buffet
    Proceedings of the Seventeenth International Conference on Automated Planning and Scheduling (ICAPS'07), Providence, USA (2007)
    Policy-Gradients for PSRs and POMDPs
    Olivier Buffet
    Owen Thomas
    Proc. 11th Intl. Conf. on Artificial Intelligence and Statistics (AISTATS), Society for Artificial Intelligence and Statistics, San Juan, Puerto Rico (2007)
    Fast Online Policy Gradient Learning with SMD Gain Vector Adaptation
    Nicol N. Schraudolph
    Jin Yu
    Advances in Neural Information Processing Systems, The MIT Press, Cambridge, MA (2006), pp. 1185-1192
    Abstract: Reinforcement learning by direct policy gradient estimation is attractive in theory but in practice leads to notoriously ill-behaved optimization problems. We improve its robustness and speed of convergence with stochastic meta-descent, a gain vector adaptation method that employs fast Hessian-vector products. In our experiments the resulting algorithms outperform previously employed online stochastic, offline conjugate, and natural policy gradient methods.
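The gain-vector idea can be sketched with a much simpler stand-in. Real SMD adapts per-parameter gains using Hessian-vector products; the flavour of per-parameter step-size adaptation can be shown with a cheap sign-agreement rule instead. Everything below is a hypothetical illustration, not the paper's algorithm.

```python
import math

def gain_adapted_descent(grad_fn, w, steps=200, eta0=0.05, meta=0.02):
    """Gradient descent with a per-parameter gain vector. Gains grow
    while successive gradients agree in sign and shrink when they
    disagree -- a crude stand-in for SMD's Hessian-vector update."""
    gains = [eta0] * len(w)
    prev = [0.0] * len(w)
    for _ in range(steps):
        g = grad_fn(w)              # one gradient evaluation per step
        for i in range(len(w)):
            s = g[i] * prev[i]
            if s > 0:               # same direction twice: speed up
                gains[i] = min(gains[i] * math.exp(meta), 0.5)
            elif s < 0:             # overshoot detected: slow down
                gains[i] = max(gains[i] * math.exp(-meta), 1e-4)
            w[i] -= gains[i] * g[i]
        prev = g
    return w

# Minimise f(w) = (w0 - 3)^2 + (w1 + 2)^2 from the origin.
w = gain_adapted_descent(lambda w: [2 * (w[0] - 3), 2 * (w[1] + 2)],
                         [0.0, 0.0])
```

Each parameter ends up with its own effective learning rate, which is the property that makes policy-gradient optimisation better behaved in the paper's setting.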
    Policy-Gradient for Robust Planning (French)
    O. Buffet
    Actes de la conférence francophone sur l'apprentissage automatique (CAp'06) (2006)
    Policy-Gradient Methods for Planning
    Advances in Neural Information Processing Systems, The MIT Press, Cambridge, MA (2006)
    The Factored Policy Gradient planner (IPC-06 Version)
    O. Buffet
    Proceedings of the Fifth International Planning Competition (2006)
    Policy-Gradient for Robust Planning
    O. Buffet
    Proceedings of the ECAI'06 Workshop on Planning, Learning and Monitoring with Uncertainty and Dynamic Worlds (PLMUDW'06) (2006)
    Simulation Methods for Uncertain Decision-Theoretic Planning
    O. Buffet
    Proceedings of the IJCAI 2005 Workshop on Planning and Learning in A Priori Unknown or Dynamic Domains
    A Two-Teams Approach for Robust Probabilistic Temporal Planning
    O. Buffet
    Proceedings of the ECML'05 Workshop on Reinforcement Learning in Non-Stationary Environments (2005)
    Robust Planning with (L)RTDP
    O. Buffet
    Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI'05) (2005)
    Prottle: A Probabilistic Temporal Planner
    I. Little
    S. Thiébaux
    Proc. AAAI'05 (2005)
    Robust Planning with (L)RTDP (in French)
    O. Buffet
    Actes de la conférence francophone sur l'apprentissage automatique (CAp'05) (2005)
    Decision-Theoretic Military Operations Planning
    Sylvie Thiébaux
    Lin Zhang
    Proc. ICAPS, AAAI (2004), pp. 402-411
    Filtered Reinforcement Learning
    Proceedings of the 15th European Conference on Machine Learning, Springer (2004), pp. 27-38
    Policy-Gradient Algorithms for Partially Observable Markov Decision Processes
    Ph.D. Thesis, The Australian National University (2003)
    Scaling Internal-State Policy-Gradient Methods for POMDPs
    Jonathan Baxter
    Proceedings of the 19th International Conference on Machine Learning, Morgan Kaufmann, Sydney, Australia (2002)
    Emmerald: A Fast Matrix-Matrix Multiply Using Intel SIMD Technology
    Jonathan Baxter
    Concurrency and Computation: Practice and Experience, 13 (2001), pp. 103-119
    General Matrix-Matrix Multiplication Using SIMD Features of the PIII
    Jonathan Baxter
    Euro-Par 2000: Parallel Processing, Springer-Verlag, Munich, Germany
    92c/MFlop/s, Ultra-Large-Scale Neural-Network Training on a PIII Cluster
    Jonathan Baxter
    Robert Edwards
    Proceedings of Super Computing 2000, Dallas, TX.