Google Flu Trends gets a brand new engine

October 31, 2014

Posted by Christian Stefansen, Senior Software Engineer

Each year the flu kills thousands of people and affects millions around the world. So it’s important that public health officials and health professionals learn about outbreaks as quickly as possible. In 2008 we launched Google Flu Trends in the U.S., using aggregate web searches to indicate when and where influenza was striking in real time. These models nicely complement other survey systems—they’re more fine-grained geographically, and they’re typically more immediate, up to 1-2 weeks ahead of traditional methods such as the CDC’s official reports. They can also be incredibly helpful for countries that don’t have official flu tracking. Since launching, we’ve expanded Flu Trends to cover 29 countries, and launched Dengue Trends in 10 countries.

The original model performed surprisingly well despite its simplicity. It was retrained just once per year, and typically used only the 50 to 300 queries that produced the best estimates for prior seasons. We then left it to perform through the new season and evaluated it at the end. It didn’t use the official CDC data for estimation during the season—only in the initial training.
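
To make the "train once, then leave it alone" idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the production system: the ranking by Pearson correlation, the single straight-line fit, the toy numbers, and every name in it (`select_queries`, `train_once`, `estimate`, the example query strings) are hypothetical choices made for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_queries(query_series, official_series, k):
    """Keep the k queries whose weekly volume best tracked the official data.

    Assumption: "best" is measured here by Pearson correlation.
    """
    ranked = sorted(
        query_series,
        key=lambda q: pearson(query_series[q], official_series),
        reverse=True,
    )
    return ranked[:k]


def train_once(query_series, official_series, k):
    """Fit a single straight line from the mean volume of the chosen queries
    to the official flu level, using prior-season data only."""
    chosen = select_queries(query_series, official_series, k)
    weeks = len(official_series)
    signal = [sum(query_series[q][w] for q in chosen) / len(chosen) for w in range(weeks)]
    ms, mo = sum(signal) / weeks, sum(official_series) / weeks
    var = sum((s - ms) ** 2 for s in signal)
    cov = sum((s - ms) * (o - mo) for s, o in zip(signal, official_series))
    slope = cov / var if var else 0.0
    intercept = mo - slope * ms
    return chosen, slope, intercept


def estimate(model, week_volumes):
    """Estimate one week's flu level from that week's query volumes alone.
    No official data is consulted here, mirroring in-season use."""
    chosen, slope, intercept = model
    signal = sum(week_volumes[q] for q in chosen) / len(chosen)
    return intercept + slope * signal


# Toy prior-season data (made up for illustration).
official = [1.0, 2.0, 4.0, 3.0, 1.5]
queries = {
    "flu symptoms": [10, 20, 40, 30, 15],
    "fever remedy": [12, 18, 35, 33, 14],
    "basketball scores": [30, 10, 20, 25, 35],
}

model = train_once(queries, official, k=2)
new_week = {"flu symptoms": 40, "fever remedy": 35, "basketball scores": 20}
print(model[0])
print(round(estimate(model, new_week), 2))
```

In this sketch, official data appears only inside `train_once`. Once the season starts, `estimate` sees nothing but search volumes, so any drift in how people search goes uncorrected until the next yearly retraining.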

In the 2012/2013 season, our estimates were significantly higher than the CDC’s reported U.S. flu levels. We investigated and in the 2013/2014 season launched a retrained model (still using the original method). It performed within the historic range, but we wondered: could we do even better? Could we improve the accuracy significantly with a more robust model that learns continuously from official flu data?

So for the 2014/2015 season, we’re launching a new Flu Trends model in the U.S. that—like many of the best performing methods [1, 2, 3] in the literature—takes official CDC flu data into account as the flu season progresses. We’ll publish the details in a technical paper soon. We look forward to seeing how the new model performs in 2014/2015 and whether this method could be extended to other countries.

As we’ve said since 2009, "This system is not designed to be a replacement for traditional surveillance networks or supplant the need for laboratory-based diagnoses and surveillance." But we do hope it can help alert health professionals to outbreaks early, including in areas without traditional monitoring, and give us all better odds against the flu.

Stay healthy this season!