EvaNet: A Family of Diverse, Fast and Accurate Video Architectures
Abstract
We present a novel evolutionary algorithm that automatically constructs video architectures whose layers explore space-time interactions. The discovered architectures are accurate, diverse, and efficient. Ensembling such models yields further accuracy gains, producing solutions that are both faster and more accurate than previous state-of-the-art models. Evolved models can be used across datasets and to build more powerful models for video understanding.