Sai Teja Peddinti

Sai Teja Peddinti is a Research Scientist in the Infrastructure Security and Privacy group at Google. He received his PhD in Computer Science from New York University in 2015. His PhD work focused on large-scale, data-driven analysis to understand user privacy preferences and concerns and to evaluate the effectiveness of privacy solutions. His research interests are in privacy, machine learning, network and cloud security, and cryptography.
Authored Publications
"You Have Been Selected as the Winner": Characterizing User-Reported Scams on TikTok
Smirity Kaushik
Kyle Beadle
Gauri Nayak
Madelyn Rose Sanfilippo
Mainack Mondal
Yang Wang
JingJei Li
Yixin Zou
USENIX Symposium on Usable Privacy and Security (SOUPS) (2026)
Short-form video platforms (SVPs) such as TikTok have grown rapidly in popularity. While online scams have been extensively studied, the extent to which they take new forms on SVPs and the discourses around them remain understudied. Using TikTok as a case study, we analyzed 150 videos in which content creators reported scam experiences and offered anti-scam advice. We focus on how TikTok users (creators, followers, and commenters) discuss scams, rather than analyzing scams. Our analysis surfaces six types of scams, including creator impersonation and account badge verification scams that target TikTok's influencer-follower ecosystem. Scammers also exploit platform-specific features (e.g., direct messaging or the "For You Page") to lure victims. In response, TikTok users share strategies to identify scammer profiles and communication cues, building community support through anti-scam advice. Based on our findings, we offer recommendations for systemizing platform support to combat scams and leveraging the influencer ecosystem to raise awareness.
LLM-Powered Analysis of IoT User Reviews: Tracking and Ranking Security and Privacy Concerns
Taufiq Islam Protick
Anupam Das
Proceedings of the International AAAI Conference on Web and Social Media (ICWSM) (2026)
Being able to understand the security and privacy (S&P) concerns of IoT users brings benefits to both developers and users. To learn about users' views, we examine IoT product reviews on Amazon, one of the biggest IoT markets. This work presents a state-of-the-art methodology to identify and categorize reviews in which users express S&P concerns. We developed an automated pipeline by fine-tuning GPT-3.5-Turbo to build two models: the Classifier-Rationalizer-Categorizer and the Thematic Mapper. By leveraging dynamic few-shot prompting and the model's large context size, our pipeline achieved over 97% precision and recall, significantly outperforming keyword-based and classical ML methods. We applied our pipeline to 91K Amazon reviews about fitness trackers, smart speakers and cameras, over multiple years. We found that on average 5% contained S&P concerns, while security cameras exhibited the highest prevalence at 10%. Our method detected significantly more S&P-relevant reviews than prior work: 15x more for fitness trackers, 29% more for smart speakers, and 70% more for cameras. Our longitudinal analysis reveals that concerns like surveillance and data control have persisted for years, suggesting limited industry progress. We demonstrate that across all device types, users consistently demand more precise control over what data is collected and shared. We uncover challenges in multi-user and multi-device interactions, identifying two previously unreported themes concerning inadequate controls for account separation and data access. These findings, ranging from broad persistent trends to specific instances of customer loss, offer actionable insights for developers to improve user satisfaction and trust.
What’s on My Network? Using Large Language Models to Identify Real-World IoT Devices at Scale
Rameen Mahmood
Danny Yuxing Huang
Proceedings of ACM International Conference on Emerging Networking Experiments and Technologies (CoNEXT), Association for Computing Machinery (2026)
The growth of IoT devices in shared environments has outpaced our ability to identify them, posing urgent risks to privacy, safety, and accountability. This challenge is especially pronounced in open-world environments, where network traffic metadata is often sparse, noisy, or adversarial. To address this problem, we introduce a semantic inference pipeline that reframes device identification as a language modeling task over real-world network metadata. As this approach depends on reliable supervision, we first construct high-fidelity vendor labels for the IoT Inspector dataset (the largest real-world corpus of its kind) using an ensemble of large language models guided by mutual-information and entropy-based stability scores. We then instruction-tune a quantized LLaMA 3.1 8B model on this dataset using curriculum learning to support generalization under sparsity and long-tail vendor distributions. Our model achieves 98.69% top-1 and 90.73% macro accuracy across 2,015 vendors, while remaining robust to missing fields, protocol drift, and adversarial manipulation. We also evaluate the model on an independent IoT testbed dataset, assess explanation quality, and conduct adversarial tests to probe robustness under spoofed and obfuscated input. These results position instruction-tuned LLMs as a scalable, interpretable foundation for trustworthy device identification at scale.
Understanding U.S. Users' Security and Privacy Transparency Needs for Consumer-Facing Generative AI
Jiaxun Cao
Yu Dong
Chunxi Zhan
Rithvik Neti
Pardis Emami-Naeini
USENIX Symposium on Usable Privacy and Security (SOUPS) (2026)
Users increasingly rely on consumer-facing generative AI (GenAI) for tasks ranging from everyday needs to sensitive use cases. Yet, it remains unclear whether and how existing security and privacy (S&P) communications in GenAI tools shape users’ adoption decisions and experiences. Understanding how users seek, interpret, and evaluate S&P information is critical for designing usable transparency that users can trust and act on. We conducted semi-structured interviews and design sessions with 21 U.S. GenAI users. Our findings suggest that available S&P information rarely drove initial adoption in practice, as participants often perceived it as incomplete, ineffective, or not credible. Instead, they relied on rough proxies (e.g., popularity) to infer S&P practices. After adoption, S&P uncertainty constrained participants’ willingness to use GenAI tools, especially for high-stakes purposes, and, in some cases, contributed to discontinued use. Participants therefore called for transparency that supports decisions and actions through trustworthy information (e.g., independent evaluations) and usable interfaces (e.g., on-demand disclosure). We categorize participants’ desired design practices into five dimensions to facilitate systematic future investigation into best practices. We conclude with recommendations for researchers, designers, and policymakers to improve S&P transparency in consumer-facing GenAI.
Nudging Developers Toward Privacy: Evaluating the Impact of Personalized App Review Reports
Omer Akgul
Michelle L. Mazurek
USENIX Symposium on Usable Privacy and Security (SOUPS) (2026)
Mobile application developers often struggle to create accurate privacy notices or implement robust privacy practices due to limited expertise or resources. While users share unsolicited privacy feedback in app reviews, and prior research has characterized this feedback, developer reactions to it remain unexplored. This study explores whether personalized privacy review reports, which summarize real user feedback for a developer's own app, can effectively nudge developers toward planning privacy improvements. We surveyed 42 app developers, presenting them with reports containing privacy themes, temporal trends, peer benchmarks, and emotion distributions derived from their apps' reviews. Our findings indicate that these privacy report interventions proved highly effective, with 76% (32 of 42) of participants finding at least one section of the report useful. Furthermore, exposure to the report increased the participants' intent to pursue privacy-relevant actions (such as reorganizing the UI, enhancing privacy communications, or adding/removing features), with 69% (29 of 42) of participants indicating an increased intent to do so. Almost all developers expressed a desire to receive such privacy reports periodically or on demand. These results indicate that making this style of report broadly available across the industry could foster a more privacy-conscious mobile ecosystem.
Beyond PII: How Users Perceive and Attempt to Mitigate Implicit LLM Inference
Synthia Wang
Nick Feamster
Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI), Association for Computing Machinery
Large Language Models (LLMs) such as ChatGPT can infer personal attributes from seemingly innocuous text, raising privacy risks beyond memorized data leakage. While prior work has demonstrated these risks, little is known about how users estimate and respond to them. We conducted a survey with 240 U.S. participants who judged text snippets for inference risks, reported concern levels, and attempted rewrites to block inference. We compared their rewrites with those generated by ChatGPT and Rescriber, a state-of-the-art sanitization tool. Results show that participants struggled to anticipate inference, performing a little better than chance. User rewrites were effective in just 28% of cases, better than Rescriber but worse than ChatGPT. We examined our participants’ rewriting strategies and observed that while paraphrasing was the most common strategy, it was also the least effective; abstraction and adding ambiguity were more successful. Our work highlights the importance of inference-aware design in LLM interactions.
Large language models (LLMs) are a class of powerful and versatile models that are beneficial to many industries. With the emergence of LLMs, we take a fresh look at cyber security, specifically exploring and summarizing the potential of LLMs in addressing challenging problems in the security and safety domains.
Evaluating Privacy Perceptions, Experience, and Behavior of Software Development Teams
Maxwell Prybylo
Sara Haghighi
Sepideh Ghanavati
Symposium on Usable Privacy and Security (SOUPS), USENIX Association (2024), pp. 101-120
With the increase in the number of privacy regulations, small development teams are forced to make privacy decisions on their own. In this paper, we conduct a mixed-method survey study, including statistical and qualitative analysis, to evaluate the privacy perceptions, practices, and knowledge of members involved in various phases of the Software Development Life Cycle (SDLC). Our survey includes 362 participants from 23 countries, encompassing roles such as product managers, developers, and testers. Our results show diverse definitions of privacy across SDLC roles, emphasizing the need for a holistic privacy approach throughout SDLC. We find that software teams, regardless of their region, are less familiar with privacy concepts (such as anonymization), relying on self-teaching and forums. Most participants are more familiar with GDPR and HIPAA than other regulations, with multi-jurisdictional compliance being their primary concern. Our results advocate the need for role-dependent solutions to address the privacy challenges, and we highlight research directions and educational takeaways to help improve privacy-aware SDLC.
A Decade of Privacy-Relevant Android App Reviews: Large Scale Trends
Omer Akgul
Michelle L. Mazurek
Benoit Seguin
33rd USENIX Security Symposium (USENIX Security 24), USENIX Association (2024), pp. 5089-5106
We present an analysis of 12 million instances of privacy-relevant reviews publicly visible on the Google Play Store that span a 10-year period. By leveraging state-of-the-art NLP techniques, we examine what users have been writing about privacy along multiple dimensions: time, countries, app types, diverse privacy topics, and even across a spectrum of emotions. We find consistent growth of privacy-relevant reviews, and explore topics that are trending (such as Data Deletion and Data Theft), as well as those on the decline (such as privacy-relevant reviews on sensitive permissions). We find that although privacy reviews come from more than 200 countries, 33 countries provide 90% of privacy reviews. We conduct a comparison across countries by examining the distribution of privacy topics a country’s users write about, and find that geographic proximity is not a reliable indicator that nearby countries have similar privacy perspectives. We uncover some countries with unique patterns and explore those herein. Surprisingly, we uncover that it is not uncommon for reviews that discuss privacy to be positive (32%); many users express pleasure about privacy features within apps or privacy-focused apps. We also uncover some unexpected behaviors, such as the use of reviews to deliver privacy disclaimers to developers. Finally, we demonstrate the value of analyzing app reviews with our approach as a complement to existing methods for understanding users' perspectives about privacy.
In this paper we study users' opinions about the privacy of their mobile health apps. We look at what they write in app reviews in the 'Health & Fitness' category on the Google Play store. We identified 2832 apps in this category (based on 1K minimum installs). Using NLP/LLM analyses, we find that 76% of these apps have at least some privacy reviews. In total this yields over 164,000 reviews about privacy, from over 150 countries and in 25 languages. Our analysis identifies top themes and offers an approximation of how widespread these issues are around the world. We show that the top 4 themes (Data Sharing and Exposure, Permission Requests, Location Tracking, and Data Collection) are issues of concern in over 70 countries. Our automatically generated thematic summaries reveal interesting aspects that deserve further research around user suspicions (unneeded data collection), user requests (more fine-grained control over data collection and data access), as well as user behavior (uninstalling apps).