Multimodal Multitask Representation Learning for Pathology Biobank Metadata Prediction

Yuannan Cai
Angela Lin
Fraser Tan
Cameron Chen


Metadata are general characteristics of the data in a well-curated and condensed format, and have proven useful for decision making, knowledge discovery, and the organization of heterogeneous biobank data. Among all data types in a biobank, pathology is the key component and also serves as the gold standard for diagnosis. To maximize the capability of biobanks and enable rapid progress in biomedical science, utilizing pathology metadata is essential, yet annotating it requires enormous expert effort due to the unstructured nature and complexity of pathology information. In this study, we develop a multimodal multitask learning framework that learns generalizable representations of pathology data to predict four major types of biobank metadata for pathology images. We demonstrate that incorporating multimodal information, such as text and case-level categorical data, improves metadata prediction performance while multiple downstream tasks are considered simultaneously. Such a pathology metadata prediction system may be adopted to reduce the expert effort of manual annotation and ultimately accelerate data-driven research through better utilization of the pathology biobank.