
Andreas Terzis
Authored Publications
Google’s Approach to Protecting Privacy in the Age of AI
Reiner Critides
Julien Freudiger
Google, 1600 Amphitheatre Parkway, Mountain View, CA 94043 (2025)
AI products introduce new privacy challenges. Finding the right privacy solution is central to developing innovative products, especially as AI models increasingly handle user data. In this paper, we propose a framework to reason about privacy in AI, and discuss how Privacy Enhancing Technologies (PETs) enable novel user experiences by reducing privacy risks in the AI development lifecycle. We argue that privacy protections are not inherently at odds with utility; in contrast, we discuss how building privacy into products from the start can create better, more trustworthy experiences for everyone.
Private prediction for large-scale synthetic text generation
Alex Bie
Natalia Ponomareva
Findings of EMNLP 2024
We present an approach for generating differentially private synthetic text using large language models (LLMs), via private prediction. In the private prediction framework, we only require the output synthetic data to satisfy differential privacy guarantees. This is in contrast to approaches that train a generative model on potentially sensitive user-supplied source data and seek to ensure the model itself is safe to release.
We prompt a pretrained LLM with source data, but ensure that next-token predictions are made with differential privacy guarantees. Previous work in this paradigm reported generating a small number of examples (<10) at reasonable privacy levels, an amount of data that is useful only for downstream in-context learning or prompting. In contrast, we make changes that allow us to generate thousands of high-quality synthetic data points, greatly expanding the set of potential applications. Our improvements come from an improved privacy analysis and a better private selection mechanism, which makes use of the equivalence between the softmax layer for sampling tokens in LLMs and the exponential mechanism. Furthermore, we introduce a novel use of public predictions via the sparse vector technique, in which we do not pay privacy costs for tokens that are predictable without sensitive data; we find this to be particularly effective for structured data.
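The private selection mechanism above rests on the observation that sampling a token from a softmax over logits is exactly the exponential mechanism with the logits as utility scores. A minimal sketch of that equivalence follows; the function name, the sensitivity parameter, and the toy logits are illustrative, not the paper's implementation:

```python
import math
import random

def exponential_mechanism_sample(logits, epsilon, sensitivity=1.0, rng=random):
    """Sample index i with probability proportional to
    exp(epsilon * logits[i] / (2 * sensitivity)) -- the exponential mechanism.
    The factor epsilon / (2 * sensitivity) acts as an inverse temperature,
    so this is ordinary softmax sampling over rescaled logits."""
    scale = epsilon / (2.0 * sensitivity)
    m = max(logits)                      # subtract max for numerical stability
    weights = [math.exp(scale * (l - m)) for l in logits]
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(logits) - 1               # guard against float rounding
```

Lower epsilon flattens the distribution (more noise, more privacy); higher epsilon concentrates mass on the highest-logit token.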
An Internet-Wide Analysis of Traffic Policing
Tobias Flach
Luis Pedrosa
Tayeb Karim
Ethan Katz-Bassett
Ramesh Govindan
SIGCOMM (2016)
Large flows like videos consume significant bandwidth. Some ISPs actively manage these high-volume flows with techniques like policing, which enforces a flow rate by dropping excess traffic. While the existence of policing is well known, our contribution is an Internet-wide study quantifying its prevalence and impact on video quality metrics. We developed a heuristic to identify policing from server-side traces and built a pipeline to deploy it at scale on hundreds of servers worldwide within one of the largest online content providers. Using a dataset of 270 billion packets served to 28,400 client ASes, we find that, depending on region, up to 7% of lossy transfers are policed. Loss rates are on average 6× higher when a trace is policed, and it impacts video playback quality. We show that alternatives to policing, like pacing and shaping, can achieve traffic management goals while avoiding the deleterious effects of policing.
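Policers are commonly built on a token bucket: packets that conform to the configured rate pass, and excess packets are simply dropped rather than queued (a shaper would queue them instead). A toy sketch of that behavior, with illustrative names and parameters rather than any ISP's actual configuration:

```python
class TokenBucketPolicer:
    """Toy token-bucket policer: conforming packets pass, excess is dropped.
    A shaper would instead queue excess packets and release them later."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps          # token refill rate, bytes per second
        self.burst = burst_bytes      # bucket depth, bytes
        self.tokens = burst_bytes     # bucket starts full
        self.last_ts = 0.0

    def admit(self, ts, pkt_bytes):
        # Refill tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + (ts - self.last_ts) * self.rate)
        self.last_ts = ts
        if pkt_bytes <= self.tokens:
            self.tokens -= pkt_bytes
            return True               # conforming: forwarded
        return False                  # excess: dropped by the policer
```

A sender bursting above the configured rate sees the drops directly, which is the loss signature the study's heuristic looks for in server-side traces.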
CQIC: Revisiting Cross-Layer Congestion Control for Cellular Networks
Feng Lu
Hao Du
Geoffrey M. Voelker
Alex C. Snoeren
Proceedings of The 16th International Workshop on Mobile Computing Systems and Applications (HotMobile), ACM (2015), pp. 45-50
With the advent of high-speed cellular access and the overwhelming popularity of smartphones, a large percent of today’s Internet content is being delivered via cellular links. Due to the nature of long-range wireless signal propagation, the capacity of the last hop cellular link can vary by orders of magnitude within a short period of time (e.g., a few seconds). Unfortunately, TCP does not perform well in such fast-changing environments, potentially leading to poor spectrum utilization and high end-to-end packet delay.
In this paper we revisit seminal work in cross-layer optimization in the context of 4G cellular networks. Specifically, we leverage the rich physical layer information exchanged between base stations (NodeB) and mobile phones (UE) to predict the capacity of the underlying cellular link, and propose CQIC, a cross-layer congestion control design. Experiments on real cellular networks confirm that our capacity estimation method is both accurate and precise. A CQIC sender uses these capacity estimates to adjust its packet sending behavior. Our preliminary evaluation reveals that CQIC improves throughput over TCP by 1.08–2.89× for small and medium flows. For large flows, CQIC attains throughput comparable to TCP while reducing the average RTT by 2.38–2.65×.
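The core idea is that a channel-quality report maps to an achievable spectral efficiency, which (times the channel bandwidth) predicts link capacity; the sender can then size its window to that capacity rather than probing for loss. A rough sketch under stated assumptions: the efficiency table below is a simplified illustration (real mappings are standardized by 3GPP, and the paper's estimator is more sophisticated), and all names are hypothetical:

```python
# Illustrative spectral efficiencies (bits/s/Hz) for a simplified
# channel-quality indicator (CQI); values here are made up for the sketch.
CQI_EFFICIENCY = {1: 0.15, 4: 0.6, 7: 1.5, 10: 2.7, 13: 4.5, 15: 5.5}

def estimate_capacity_bps(cqi, bandwidth_hz):
    """Predict last-hop cellular link capacity from the reported CQI."""
    return CQI_EFFICIENCY[cqi] * bandwidth_hz

def sending_window_bytes(cqi, bandwidth_hz, rtt_s):
    """Size the send window to the bandwidth-delay product of the
    predicted capacity, instead of loss-driven probing as in TCP."""
    return estimate_capacity_bps(cqi, bandwidth_hz) * rtt_s / 8.0
```

Because the CQI updates on millisecond timescales, such an estimate can track capacity swings far faster than TCP's loss-driven feedback loop.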
Reducing Web Latency: the Virtue of Gentle Aggression
Tobias Flach
Barath Raghavan
Shuai Hao
Ethan Katz-Bassett
Ramesh Govindan
Proceedings of the ACM Conference of the Special Interest Group on Data Communication (SIGCOMM '13), ACM (2013)
To serve users quickly, Web service providers build infrastructure closer to clients and use multi-stage transport connections. Although these changes reduce client-perceived round-trip times, TCP's current mechanisms fundamentally limit latency improvements. We performed a measurement study of a large Web service provider and found that, while connections with no loss complete close to the ideal latency of one round-trip time, TCP's timeout-driven recovery causes transfers with loss to take five times longer on average.
In this paper, we present the design of novel loss recovery mechanisms for TCP that judiciously use redundant transmissions to minimize timeout-driven recovery. Proactive, Reactive, and Corrective are three qualitatively different, easily-deployable mechanisms that (1) proactively recover from losses, (2) recover from them as quickly as possible, and (3) reconstruct packets to mask loss. Crucially, the mechanisms are compatible both with middleboxes and with TCP's existing congestion control and loss recovery. Our large-scale experiments on Google's production network that serves billions of flows demonstrate a 23% decrease in the mean and 47% in 99th percentile latency over today's TCP.
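One simple way to "reconstruct packets to mask loss," as the Corrective mechanism aims to do, is XOR parity over a small window of packets: transmit the byte-wise XOR of recent payloads, and a receiver missing exactly one of them can rebuild it without waiting for a retransmission. The sketch below illustrates that general idea only, not the paper's exact encoding; it assumes equal-length payloads for simplicity:

```python
def xor_parity(payloads):
    """Build a parity packet as the byte-wise XOR of the payloads
    (shorter payloads are treated as zero-padded to the longest)."""
    parity = bytearray(max(len(p) for p in payloads))
    for p in payloads:
        for i, b in enumerate(p):
            parity[i] ^= b
    return bytes(parity)

def recover(received, parity):
    """Reconstruct the single missing payload in a window by XOR-ing
    the parity packet with every payload that did arrive."""
    missing = bytearray(parity)
    for p in received:
        for i, b in enumerate(p):
            missing[i] ^= b
    return bytes(missing)
```

The trade-off is the one the abstract names "gentle aggression": a little redundant transmission up front buys recovery within one round trip instead of a retransmission timeout.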
packetdrill: Scriptable Network Stack Testing, from Sockets to Packets
Lawrence Brakmo
Matt Mathis
Barath Raghavan
Hsiao-keng Jerry Chu
Tom Herbert
Proceedings of the USENIX Annual Technical Conference (USENIX ATC 2013), USENIX, 2560 Ninth Street, Suite 215, Berkeley, CA, 94710 USA, pp. 213-218
Testing today’s increasingly complex network protocol implementations can be a painstaking process. To help meet this challenge, we developed packetdrill, a portable, open-source scripting tool that enables testing the correctness and performance of entire TCP/UDP/IP network stack implementations, from the system call layer to the hardware network interface, for both IPv4 and IPv6. We describe the design and implementation of the tool, and our experiences using it to execute 657 test cases. The tool was instrumental in our development of three new features for Linux TCP—Early Retransmit, Fast Open, and Loss Probes—and allowed us to find and fix 10 bugs in Linux. Our team uses packetdrill in all phases of the development process for the kernel used in one of the world’s largest Linux installations.
RACNet: a high-fidelity data center sensing network
Chieh-Jan Mike Liang
Jie Liu
Liqian Luo
Feng Zhao
SenSys '09: Proceedings of the 7th ACM Conference on Embedded Networked Sensor Systems, ACM, New York, NY, USA (2009), pp. 15-28
Peeking Through the Cloud
Fabian Monrose
Niels Provos
6th Conference on Applied Cryptography and Network Security (2008)