Job Type Extraction for Service Businesses

Yaping Qi
Hayk Zakaryan
Yonghua Wu
Companion Proceedings of the ACM Web Conference 2023

Abstract

Google My Business (GMB) is a platform that allows business owners to manage their business profiles, which are displayed when a user issues a relevant query on Google Search or Maps. Many GMB businesses provide diverse services, from home cleaning and plumbing to legal services and education. However, the exact service content, which we call job types, is often missing from their profiles. This leaves the burden of finding such content to the users, either through the tedious work of scanning business websites or through time-consuming calls to the owners. In this paper, we describe how we build a pipeline to automatically extract job types from the websites of business owners and how we solve scalability issues for deployment. Rather than focusing on developing novel and sophisticated machine learning models, we share the challenges we faced and our practical experience in building such a pipeline, including the cold start problem of dataset collection with limited human annotation resources, scalability, reaching a launch bar of high precision, and building a general pipeline with reasonable coverage of arbitrary free-text web pages without relying on the Document Object Model (DOM) structure. Given these challenges, standard approaches to information extraction either do not directly apply or are not scalable enough to be served. We show how we address these challenges at different stages of the extraction pipeline, including: (1) utilizing structured content such as tables and lists to tackle the cold start problem of dataset collection; (2) exploiting various kinds of context information to improve model performance without hurting scalability; and (3) formulating the extraction problem as a retrieval task to improve generalizability, efficiency, and coverage. The pipeline has been successfully deployed and is scalable enough to be refreshed every few days to extract the latest online information.
The extracted job types serve millions of users of Google Search and Google Maps in at least three use cases: (1) job types of a place are directly displayed on mobile devices; (2) job types explain why a place shows up for a given query; and (3) job types are used as a signal to rank business places. According to a user survey, the displayed job types have greatly increased the probability of a user hiring a service provider.