Google Research

Google Landmarks Dataset v2


The Google Landmarks Dataset v2 (GLD-v2) is a computer vision dataset that aims to foster research in large-scale instance-level recognition. Specifically, it poses the task of recognizing human-made and natural landmarks in images. GLD-v2 was introduced in 2019 in the GLD-v2 paper and a Google AI blog post.

The dataset covers two different applications: landmark recognition and landmark retrieval. The recognition task models an application like Google Lens, where users take pictures of landmarks and want to recognize them. The retrieval task models an application like search-by-image in Google Images, where the goal is to find as many similar images as possible to a query image.
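To make the difference between the two tasks concrete, here is a minimal sketch in Python. It assumes each image has already been mapped to an embedding vector; the toy index, the image ids, the labels, and the helper names `cosine`, `recognize`, and `retrieve` are all hypothetical and are not part of GLD-v2 or any official baseline. Recognition returns a single landmark label with a score, while retrieval returns a ranked list of index images.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical toy index: (image_id, landmark_label, embedding).
# Real embeddings would come from a trained model; these are made up.
INDEX = [
    ("img_a", "eiffel_tower", [0.9, 0.1, 0.0]),
    ("img_b", "eiffel_tower", [0.8, 0.2, 0.1]),
    ("img_c", "golden_gate", [0.1, 0.9, 0.2]),
    ("img_d", "matterhorn", [0.0, 0.2, 0.9]),
]

def retrieve(query, index, k=None):
    """Retrieval: rank index images by similarity to the query."""
    ranked = sorted(index, key=lambda item: cosine(query, item[2]), reverse=True)
    return [image_id for image_id, _, _ in ranked[:k]]

def recognize(query, index):
    """Recognition: predict one landmark label, here via the nearest neighbor."""
    best = max(index, key=lambda item: cosine(query, item[2]))
    return best[1], cosine(query, best[2])

query = [0.85, 0.15, 0.05]  # hypothetical query embedding
label, score = recognize(query, INDEX)
top2 = retrieve(query, INDEX, k=2)
print(label, round(score, 3), top2)
```

Nearest-neighbor lookup is only one simple way to produce a label; a classifier over landmark ids would serve the recognition task equally well. The point of the sketch is the shape of the output: one label for recognition, a ranked image list for retrieval.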

GLD-v2 has more than 5 million images of about 200 thousand unique landmarks from around the world, making it much larger than previous datasets for this problem. The dataset was the basis for six Google Landmark Recognition and Retrieval challenges held on Kaggle.

The challenges were part of the Landmark Recognition and Instance-Level Recognition workshops at CVPR'19, ECCV'20, and ICCV'21.