Towards Fine-Grained Localization of Privacy Behaviors

Vijayanta Jain
Sepideh Ghanavati
Collin McMillan
2023 IEEE 8th European Symposium on Security and Privacy (EuroS&P), pp. 258-277

Abstract

Privacy labels help developers communicate their application's privacy behaviors (i.e., how and why an application uses personal information) to users. But, studies show that developers face several challenges in creating them and the resultant labels are often inconsistent with their application's privacy behaviors. In this paper, we create a novel methodology called fine-grained localization of privacy behaviors to locate individual statements in source code which encode privacy behaviors and predict their privacy labels. We design and develop an attention-based multi-head encoder model which creates individual representations of multiple methods and uses attention to identify relevant statements that implement privacy behaviors. These statements are then used to predict privacy labels for the application's source code and can help developers write privacy statements that can be used as notices. Our quantitative analysis shows that our approach can achieve high accuracy in identifying privacy labels, with the lowest accuracy of 91.41% and the highest of 98.45%. We also evaluate the efficacy of our approach with six software professionals from our university. The results demonstrate that our approach reduces the time and mental effort required by developers to create high-quality privacy statements and can finely localize statements in methods that implement privacy behaviors.