Tarfah Alrashed
I am a Research Scientist at Google Research, where I am part of the Dataset Search team. Our mission is to make structured data on the Web more accessible and useful. We developed Google Dataset Search, a tool that helps users discover datasets stored across the Web.
Before joining Google, I completed my Ph.D. in Computer Science at MIT CSAIL, where my research focused on designing systems that enable users to access and manipulate web data without the need to write code. You can view my pre-Google publications on my Google Scholar profile.
Authored Publications
Discovering Datasets on the Web Scale: Challenges and Recommendations for Google Dataset Search
Daniel Russell
Stella Dugall
Harvard Data Science Review (2024)
With the rise of open data in the last two decades, more datasets are online and more people are using them for projects and research. But how do people find datasets? We present the first user study of Google Dataset Search, a dataset-discovery tool that uses a web crawl and open ecosystem to find datasets. Google Dataset Search contains a superset of the datasets in other dataset-discovery tools—a total of 45 million datasets from 13,000 sources. We found that the tool addresses a previously identified need: a search engine for datasets across the entire web, including datasets in other tools. However, the tool introduced new challenges due to its open approach: building a mental model of the tool, making sense of heterogeneous datasets, and learning how to search for datasets. We discuss recommendations for dataset-discovery tools and open research questions.