Luyang Liu

Luyang Liu is a researcher at Google DeepMind working on foundation models, representation learning, and federated learning. More information can be found on his Google Scholar page.
Authored Publications
    Augmentations vs Algorithms: What Works in Self-Supervised Learning
    Warren Morningstar
    Alex Bijamov
    Chris Duvarney
    Luke Friedman
    Neha Kalibhat
    Philip Mansfield
    Renan Rojas-Gomez
    Karan Singhal
    Bradley Green
    Sushant Prakash
    arXiv (2024) (to appear)
    Abstract: We study the relative effects of data augmentations, pretraining algorithms, and model architectures in Self-Supervised Learning (SSL). While the recent literature in this space leaves the impression that the pretraining algorithm is of critical importance to performance, understanding its effect is complicated by the difficulty in making objective and direct comparisons between methods. We propose a new framework which unifies many seemingly disparate SSL methods into a single shared template. Using this framework, we identify aspects in which methods differ and observe that in addition to changing the pretraining algorithm, many works also use new data augmentations or more powerful model architectures. We compare several popular SSL methods using our framework and find that many algorithmic additions, such as prediction networks or new losses, have a minor impact on downstream task performance (often less than 1%), while enhanced augmentation techniques offer more significant performance improvements (2–4%). Our findings challenge the premise that SSL is being driven primarily by algorithmic improvements, and suggest instead a bitter lesson for SSL: that augmentation diversity and data/model scale are more critical contributors to recent advances in self-supervised learning.
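
The "single shared template" the abstract describes can be made concrete with a short sketch. This is not the paper's code; the function names (ssl_step, infonce_loss) and the InfoNCE instantiation are illustrative assumptions about what such a template might look like:

```python
# A minimal sketch (not the paper's code) of a shared SSL template:
# two augmented views, a shared encoder/projector, and a pluggable loss.
import torch
import torch.nn.functional as F

def ssl_step(encoder, projector, augment, loss_fn, x):
    """One pretraining step in the unified template.

    Swapping `loss_fn` changes the "algorithm" (SimCLR-style, BYOL-style,
    ...), while `augment` controls the augmentation pipeline that the
    paper finds matters more for downstream accuracy.
    """
    v1, v2 = augment(x), augment(x)      # two stochastic views
    z1 = projector(encoder(v1))
    z2 = projector(encoder(v2))
    return loss_fn(z1, z2)

def infonce_loss(z1, z2, temperature=0.1):
    """A standard InfoNCE loss as one instantiation of `loss_fn`."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature                     # (B, B) similarities
    labels = torch.arange(z1.shape[0], device=z1.device)  # positives on diagonal
    return F.cross_entropy(logits, labels)
```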
    Heterogeneous LoRA for Federated Fine-tuning of On-Device Foundation Models
    Yae Jee Cho
    Aldi Fahrezi
    Gauri Joshi
    The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024) (2024)
    Abstract: Foundation models (FMs) adapt well to specific domains or tasks with fine-tuning, and federated learning (FL) enables the potential for privacy-preserving fine-tuning of FMs with on-device local data. For federated fine-tuning of FMs, we consider FMs with small to medium parameter sizes (single-digit billions of parameters at most), referred to as on-device FMs (ODFMs), which can be deployed on devices for inference but can only be fine-tuned with parameter-efficient methods. In our work, we tackle the data and system heterogeneity problem of federated fine-tuning of ODFMs by proposing a novel method using heterogeneous low-rank approximations (LoRAs), namely HetLoRA. First, we show that the naive approach of using homogeneous LoRA ranks across devices faces a trade-off between overfitting and slow convergence, and thus propose HetLoRA, which allows heterogeneous ranks across client devices and efficiently aggregates and distributes these heterogeneous LoRA modules. By applying rank self-pruning locally and sparsity-weighted aggregation at the server, HetLoRA combines the advantages of high- and low-rank LoRAs, achieving improved convergence speed and final performance compared to homogeneous LoRA. Furthermore, HetLoRA offers enhanced computation efficiency compared to full fine-tuning, making it suitable for federated fine-tuning across heterogeneous devices.
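
A minimal sketch of how a server might aggregate heterogeneous-rank LoRA modules by zero-padding them to a common rank. The magnitude-based weighting below is a stand-in for the paper's sparsity-weighted aggregation, whose exact form the abstract does not give; all function and variable names are hypothetical:

```python
# Sketch: server-side aggregation of heterogeneous-rank LoRA updates.
import numpy as np

def aggregate_hetlora(client_As, client_Bs, max_rank):
    """client_As[k]: (r_k, d_in); client_Bs[k]: (d_out, r_k).

    Each client contributes a low-rank update B_k @ A_k with its own
    rank r_k. We zero-pad every factor to `max_rank` and take a
    weighted average (weights here are update magnitudes, an assumed
    proxy for the paper's sparsity-weighted scheme).
    """
    d_in, d_out = client_As[0].shape[1], client_Bs[0].shape[0]
    A_agg = np.zeros((max_rank, d_in))
    B_agg = np.zeros((d_out, max_rank))
    weights = np.array([np.linalg.norm(B @ A)
                        for A, B in zip(client_As, client_Bs)])
    weights = weights / weights.sum()
    for w, A, B in zip(weights, client_As, client_Bs):
        r = A.shape[0]
        A_agg[:r] += w * A        # rows beyond rank r stay zero
        B_agg[:, :r] += w * B     # columns beyond rank r stay zero
    return A_agg, B_agg
```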
    USER-LLM: Efficient LLM Contextualization with User Embedding
    Jiaxing Wu
    Neo Wu
    Devora Berlowitz
    Sushant Prakash
    Bradley Green
    Shawn O'Banion
    Jun Xie
    arXiv (2024) (to appear)
    Abstract: Large language models (LLMs) have revolutionized natural language processing. However, effectively incorporating complex and potentially noisy user interaction data remains a challenge. To address this, we propose User-LLM, a novel framework that leverages user embeddings to contextualize LLMs. These embeddings, distilled from diverse user interactions using self-supervised pretraining, capture latent user preferences and their evolution over time. We integrate these user embeddings with LLMs through cross-attention and soft-prompting, enabling LLMs to dynamically adapt to user context. Our comprehensive experiments on MovieLens, Amazon Review, and Google Local Review datasets demonstrate significant performance gains across various tasks. Notably, our approach outperforms text-prompt-based contextualization on long sequence tasks and tasks that require deep user understanding while being computationally efficient. We further incorporate Perceiver layers to streamline the integration between user encoders and LLMs, reducing computational demands.
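
The cross-attention integration described above can be sketched as a small fusion layer in which LLM hidden states attend to user-embedding tokens. This is an illustrative assumption about the mechanism, not the User-LLM implementation; the class name and dimensions are hypothetical:

```python
import torch
import torch.nn as nn

class UserCrossAttention(nn.Module):
    """LLM hidden states (queries) attend to user-embedding tokens
    (keys/values), letting the model condition on user context without
    spending prompt tokens on raw interaction history."""

    def __init__(self, d_model, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, hidden, user_emb):
        # hidden: (B, T, d_model); user_emb: (B, U, d_model)
        ctx, _ = self.attn(query=hidden, key=user_emb, value=user_emb)
        return self.norm(hidden + ctx)  # residual fusion into the LLM stream
```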
    Abstract: Self-Supervised Learning (SSL) enables training performant models using limited labeled data. One of the pillars underlying vision SSL is the use of data augmentations/perturbations of the input which do not significantly alter its semantic content. For audio and other temporal signals, augmentations are commonly used alongside format transforms such as Fourier transforms or wavelet transforms. Unlike augmentations, format transforms do not change the information contained in the data; rather, they express the same information in different coordinates. In this paper, we study the effects of format transforms and augmentations both separately and together on vision SSL. We define augmentations in frequency space called Fourier Domain Augmentations (FDA) and show that training SSL models on a combination of these and image augmentations can improve the downstream classification accuracy by up to 1.3% on ImageNet-1K. We also show improvements against SSL baselines in few-shot and transfer learning setups using FDA. Surprisingly, we also observe that format transforms can improve the quality of learned representations even without augmentations; however, the combination of the two techniques yields better quality.
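
A minimal sketch of what a Fourier Domain Augmentation might look like: transform an image to frequency space, jitter amplitude and phase, and invert back to pixels. The specific perturbations and parameter names are assumptions for illustration, not the paper's exact FDA recipe:

```python
import numpy as np

def fourier_domain_augment(img, amp_scale=0.1, phase_jitter=0.1, rng=None):
    """Perturb the amplitude and phase of the 2D FFT of `img`, then
    invert back to pixel space, so the augmentation acts in frequency
    coordinates rather than pixel coordinates. Assumes img in [0, 1]."""
    rng = rng or np.random.default_rng()
    spec = np.fft.fft2(img, axes=(0, 1))       # works for (H, W) or (H, W, C)
    amp, phase = np.abs(spec), np.angle(spec)
    amp = amp * (1.0 + amp_scale * rng.standard_normal(amp.shape))
    phase = phase + phase_jitter * rng.standard_normal(phase.shape)
    out = np.fft.ifft2(amp * np.exp(1j * phase), axes=(0, 1)).real
    return np.clip(out, 0.0, 1.0)
```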
    Auditing Privacy Defenses in Federated Learning via Generative Gradient Leakage
    Zhuohang Li
    Jiaxin Zhang
    Jian Liu
    The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR 2022) (2022)
    Abstract: The Federated Learning (FL) framework brings privacy benefits to distributed learning systems by allowing multiple clients to participate in a learning task under the coordination of a central server without exchanging their private data. However, recent studies have revealed that private information can still be leaked through shared gradient information. To further protect users' privacy, several defense mechanisms have been proposed to prevent privacy leakage via gradient information degradation methods, such as using additive noise or gradient compression before sharing it with the server. In this work, we validate that private training data can still be leaked under certain defense settings with a new type of leakage, i.e., Generative Gradient Leakage (GGL). Unlike existing methods that only rely on gradient information to reconstruct data, our method leverages the latent space of generative adversarial networks (GANs) learned from public image datasets as a prior to compensate for the information loss during gradient degradation. To address the nonlinearity caused by the gradient operator and the GAN model, we explore various gradient-free optimization methods (e.g., evolution strategies and Bayesian optimization) and empirically show their superiority in reconstructing high-quality images from gradients compared to gradient-based optimizers. We hope the proposed method can serve as a tool for empirically measuring the amount of privacy leakage to facilitate the design of more robust defense mechanisms.
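
To illustrate the gradient-free search the abstract mentions, here is a toy evolution-strategy loop over a GAN latent that minimizes a gradient-matching distance. The generator, the distance function, and all hyperparameters are hypothetical placeholders, not the paper's configuration:

```python
import numpy as np

def ggl_reconstruct(grad_dist, generator, latent_dim, steps=200,
                    pop=32, sigma=0.1, lr=0.05, rng=None):
    """grad_dist(image) -> distance between the image's induced gradient
    and the leaked (possibly noised/compressed) gradient; generator(z)
    maps a latent vector to an image. A basic evolution strategy stands
    in for the ES / Bayesian optimization methods explored in the paper."""
    rng = rng or np.random.default_rng()
    z = rng.standard_normal(latent_dim)
    for _ in range(steps):
        eps = rng.standard_normal((pop, latent_dim))
        losses = np.array([grad_dist(generator(z + sigma * e)) for e in eps])
        ranks = (losses - losses.mean()) / (losses.std() + 1e-8)
        # move against the estimated gradient of the loss in latent space
        z -= lr / (pop * sigma) * (ranks[:, None] * eps).sum(axis=0)
    return generator(z)
```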
    FedRolex: Model-Heterogeneous Federated Learning with Rolling Sub-Model Extraction
    Samiul Alam
    Ming Yan
    Mi Zhang
    Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022) (2022)
    Abstract: Most cross-device federated learning (FL) studies focus on the model-homogeneous setting where the global server model and local client models are identical. However, such a constraint not only excludes low-end clients who would otherwise make unique contributions to model training but also restrains clients from training large models due to on-device resource bottlenecks. In this work, we propose FedRolex, a partial training (PT)-based approach that enables model-heterogeneous FL and can train a global server model larger than the largest client model. At its core, FedRolex employs a rolling sub-model extraction scheme that allows different parts of the global server model to be trained evenly, which mitigates the client drift induced by the inconsistency between individual client models and the server model architecture. We show that FedRolex outperforms state-of-the-art PT-based model-heterogeneous FL methods (e.g., Federated Dropout) and reduces the gap between model-heterogeneous and model-homogeneous FL, especially under the large-model, large-dataset regime. In addition, we provide a theoretical statistical analysis of its advantage over Federated Dropout and evaluate FedRolex on an emulated real-world device distribution to show that FedRolex can enhance the inclusiveness of FL and boost the performance of low-end devices that would otherwise not benefit from FL.
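
The rolling sub-model extraction scheme can be illustrated in a few lines: each round, a client receives a contiguous (wrap-around) window of the global model's neurons, and the window start advances so that all parts of the global model are trained evenly over time. This index computation is a sketch of the idea, not the released FedRolex code:

```python
import numpy as np

def rolling_submodel_indices(global_width, client_width, round_idx):
    """Return the neuron indices a client trains in `round_idx`.

    The window start advances each round and wraps around, so every
    slice of the global layer gets trained at the same rate, in
    contrast to random (Federated Dropout-style) extraction."""
    start = round_idx % global_width
    return np.arange(start, start + client_width) % global_width

# e.g., global width 8, client width 4:
# round 0 -> [0 1 2 3], round 1 -> [1 2 3 4], ..., round 7 -> [7 0 1 2]
```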
    Smartphone-based Hard Braking Events Detection at Scale for Road Safety Services
    David Racz
    Julie Michelman
    Stefan Mellem
    Paul C. Eastham
    Bradley Green
    Charles Robert Armstrong
    Shawn O'Banion
    Feng Guo
    Transportation Research Part C: Emerging Technologies (2022)
    Abstract: Road crashes are the sixth leading cause of lost disability-adjusted life-years (DALYs) worldwide. One major challenge in traffic safety research is the sparsity of crashes, which makes it difficult to achieve a fine-grained understanding of crash causation and to predict future crash risk in a timely manner. Hard-braking events have been widely used as a safety surrogate due to their relatively high prevalence and ease of detection with embedded vehicle sensors. As an alternative to using sensors fixed in vehicles, this paper presents a scalable approach for detecting hard-braking events using kinematics data collected from smartphone sensors. We train a Transformer-based machine learning model for hard-braking event detection using concurrent sensor readings from smartphones and vehicle sensors from drivers who connect their phone to the vehicle while navigating in Google Maps. The detection model shows superior performance with a 0.83 Area under the Precision-Recall Curve (PR-AUC), which is 3.8× better than a GPS-speed-based heuristic model and 166.6× better than an accelerometer-based heuristic model. The detected hard-braking events are strongly correlated with crashes from publicly available datasets, supporting their use as a safety surrogate. In addition, we conduct model fairness and selection bias evaluations to ensure that the safety benefits are shared equally. The developed methodology can benefit many safety applications, such as identifying safety hot spots at the road-network level, evaluating the safety of new user interfaces, and using routing to improve traffic safety.
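
As a rough illustration of the model family described above, here is a small Transformer encoder over windows of smartphone kinematics that outputs a per-window hard-braking probability. The feature set, window shape, and layer sizes are assumptions for illustration; the abstract does not specify the paper's architecture details:

```python
import torch
import torch.nn as nn

class HardBrakeDetector(nn.Module):
    """Toy detector: encode a window of smartphone kinematics (e.g.,
    3-axis accelerometer + GPS speed = 4 channels per timestep) with a
    Transformer and emit a hard-braking probability for the window."""

    def __init__(self, n_features=4, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.proj = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):                # x: (B, T, n_features)
        h = self.encoder(self.proj(x))   # (B, T, d_model)
        return torch.sigmoid(self.head(h.mean(dim=1)))  # (B, 1) probability
```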
    Modeling the effect of exposure notification and non-pharmaceutical interventions on COVID-19 transmission in Washington state
    Matthew Abueg
    Robert Hinch
    Neo Wu
    William Probert
    Austin Wu
    Paul Eastham
    Yusef Shafi
    Matt Rosencrantz
    Zhao Cheng
    Anel Nurtay
    Lucie Abeler-Dörner
    David Bonsall
    Michael V. McConnell
    Shawn O'Banion
    Christophe Fraser
    npj Digital Medicine (2021)
    Abstract: Contact tracing is increasingly used to combat COVID-19, and digital implementations are now being deployed, many based on Apple and Google's Exposure Notification System. These systems utilize non-traditional smartphone-based technology, presenting challenges in understanding possible outcomes. In this work, we create individual-based models of three Washington state counties to explore how digital exposure notifications combined with other non-pharmaceutical interventions influence COVID-19 disease spread under various adoption, compliance, and mobility scenarios. In a model with 15% participation, we found that exposure notification could reduce infections and deaths by approximately 8% and 6%, respectively, and could effectively complement traditional contact tracing. We believe this can provide health authorities in Washington state and beyond with guidance on how exposure notification can complement traditional interventions to suppress the spread of COVID-19.
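
For intuition only, a toy branching-process sketch (far simpler than the paper's individual-based county models) of how app adoption can suppress onward transmission: a contact is notified only when both the index case and the contact run the app, so the effect scales roughly with adoption squared. All parameters are illustrative:

```python
import numpy as np

def simulate_outbreak(r0=2.5, adoption=0.15, quarantine_eff=0.9,
                      generations=10, seed_cases=100, rng=None):
    """Toy branching process with exposure notification: each
    transmitter infects Poisson(r0) contacts; a contact is notified
    with probability adoption**2 (both parties must run the app), and
    notified contacts quarantine with probability `quarantine_eff`,
    removing them from the next generation of transmitters."""
    rng = rng or np.random.default_rng(0)
    transmitters, total = seed_cases, seed_cases
    for _ in range(generations):
        offspring = rng.poisson(r0, transmitters).sum()
        total += offspring
        notified = rng.binomial(offspring, adoption ** 2)
        quarantined = rng.binomial(notified, quarantine_eff)
        transmitters = int(offspring - quarantined)
    return total
```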
    Elf: Accelerate High-resolution Mobile Deep Vision with Content-aware Parallel Offloading
    Wuyang Zhang
    Zhezhi He
    Zhenhua Jia
    Yunxin Liu
    Marco Gruteser
    Dipankar Raychaudhuri
    Yanyong Zhang
    The 27th Annual International Conference on Mobile Computing and Networking (ACM MobiCom 2021) (2021)
    Abstract: A broad class of computer vision algorithms on images or videos collected by mobile devices benefit greatly from deep learning for high application performance. Meanwhile, these applications often demand real-time responses (e.g., <100 ms), which can hardly be satisfied by mobile devices of limited computation capability. Offloading the computation from mobile devices to edge clouds has recently been proposed as a promising approach. However, previous work assumes that there always exist dedicated and powerful edge servers that devote all their computing resources to a single offloading job. This assumption can hardly hold consistently due to the distributed nature of edge clouds and the dynamic resource needs of mobile users. In this work, we propose and design a system called Elf to accelerate deep neural network inference for vision applications running on mobile devices. Elf is customized to minimize end-to-end latency through intelligent content partitioning and multi-edge-server offloading. In particular, instead of naively offloading an entire high-resolution video clip to a single edge server, we perform intelligent partitioning of the high-resolution video clips in a content- and resource-aware fashion. The partitioning leverages several techniques, including Region Proposal (RP) complexity estimation, RP location prediction, and Low-Resolution Compensation (LRC), and dynamically assigns the partitions to different computation servers. Comprehensive experiments demonstrate that Elf can reduce the end-to-end latency of multi-object segmentation tasks on DAVIS2017 by 88.5%, 94.6%, and 33.8% compared to NVIDIA Jetson TX2, Jetson Nano, and single-object counterparts, respectively.
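
The content- and resource-aware assignment of partitions to edge servers might look like the following greedy sketch, where heavier partitions (by estimated RP complexity) go to the server that would finish them earliest. This is an assumption-laden illustration, not Elf's actual scheduler:

```python
def assign_partitions(partitions, servers):
    """Greedy sketch: sort partitions by estimated complexity
    (heaviest first) and assign each to the server with the earliest
    projected finish time, since per-frame latency is bounded by the
    slowest server. `partitions`: dicts with "id" and "complexity";
    `servers`: dicts with "name" and "speed" (work units per second)."""
    loads = {s["name"]: 0.0 for s in servers}
    plan = []
    for part in sorted(partitions, key=lambda p: -p["complexity"]):
        best = min(servers, key=lambda s:
                   loads[s["name"]] + part["complexity"] / s["speed"])
        loads[best["name"]] += part["complexity"] / best["speed"]
        plan.append((part["id"], best["name"]))
    return plan

# Example:
# assign_partitions([{"id": 0, "complexity": 3.0}, {"id": 1, "complexity": 1.0}],
#                   [{"name": "edge-a", "speed": 2.0}, {"name": "edge-b", "speed": 1.0}])
```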
    Abstract: Federated Learning (FL) enables multiple distributed clients (e.g., mobile devices) to collaboratively train a centralized model while keeping the training data locally on the clients' devices. Compared to traditional centralized machine learning, FL offers many favorable features, such as offloading operations that would usually be performed by a central server and reducing the risk of serious privacy leakage. However, Byzantine clients that send incorrect or disruptive updates due to system failures or adversarial attacks may disturb the joint learning process, consequently degrading the performance of the resulting model. In this paper, we propose to mitigate these failures and attacks from a spatial-temporal perspective. Specifically, we use a clustering-based method to detect and exclude incorrect updates by leveraging their geometric properties in the parameter space. Moreover, to further handle malicious clients with time-varying behaviors, we propose to adaptively adjust the learning rate according to momentum-based update speculation. Extensive experiments on four public datasets demonstrate that our algorithm achieves enhanced robustness compared to existing methods under both cross-silo and cross-device FL settings with faulty/malicious clients.
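
A minimal sketch of the spatial filtering step: flatten client updates, find a robust center, and keep only the updates closest to it before averaging. The distance-to-median rule here stands in for the paper's clustering method, and the momentum-based learning-rate adaptation is omitted:

```python
import numpy as np

def filter_byzantine_updates(updates, keep_frac=0.8):
    """Keep the `keep_frac` fraction of client updates closest to a
    robust center (coordinate-wise median) and average them, dropping
    geometric outliers that may come from faulty or malicious clients.
    A simple distance rule stands in for the paper's clustering."""
    U = np.stack([u.ravel() for u in updates])   # (num_clients, dim)
    center = np.median(U, axis=0)                # robust center
    dists = np.linalg.norm(U - center, axis=1)
    keep = np.argsort(dists)[: max(1, int(len(updates) * keep_frac))]
    return U[keep].mean(axis=0).reshape(updates[0].shape)
```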