Publications
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.
Sort By
1 - 15 of 10128 publications
FEDAQT: ACCURATE QUANTIZED TRAINING WITH FEDERATED LEARNING
Renkun Ni
Oleg Rybakov
Phoenix Meadowlark
Tom Goldstein
Preview abstract
Federated learning has been widely used to train automatic speech recognition models, where the training procedure is decentralized to client devices to avoid data privacy concerns by keeping the training data locally. However, the limited computation resources on client devices prevent training with large models. Recently, quantization-aware training has shown the potential to train a quantized neural network with similar performance to the full-precision model while keeping the model size small and inference faster. However, these quantization methods will not save memory during training since they still keep the full-precision model. To address this issue, we propose a new quantization training framework for federated learning which saves the memory usage by training with quantized variables directly on local devices. We empirically show that our method can achieve comparable WER while only using 60% memory of the full-precision model.
View details
On the Robustness of Image-based Malware Detection against Adversarial Attacks
Yassine Mekdad
Harun Oz
Ahmet Aris
Leonardo Babun
Faraz Naseem
Selcuk Uluagac
Nasir Ghani
Abbas Acar
Network Security Empowered by Artificial Intelligence, Springer (2024)
Preview abstract
Machine and deep learning models are now one of the most valuable tools in the arsenal of computer security practitioners. Their success has been demonstrated in various network-security-oriented applications such as intrusion detection, cyber threat intelligence, vulnerability discovery, and malware detection. Nevertheless, recent research studies have shown that crafted adversarial samples can be used to evade malware detection models. Even though several defense mechanisms such as adversarial training have been proposed in the malware detection domain to address this issue, they unfortunately suffer from model poisoning and low detection accuracy. In this chapter, we assess the robustness of image-based malware classifier against four different adversarial attacks: (a) random and benign brute-force byte append attacks for black-box settings and (b) random and benign Fast Gradient Sign Method (FGSM) attacks for white-box settings. To this end, we implement a Convolutional Neural Network (CNN) to classify the image representations of Windows Portable Executable (PE) malware with a detection accuracy of 95.05%. Then, we evaluate its robustness along with MalConv, a state-of-the-art malware classifier, by applying a set of functionality-preserving adversarial attacks. Our experimental results demonstrate that image-based classifier exhibits a lower evasion rate of 5% compared to MalConv that achieves an evasion rate ranging between 44 and 54% in black-box settings. However, in white-box settings, both models fail against random byte and benign byte FGSM attacks, with an evasion rate of more than 46%.
View details
MarkovGen: Structured Prediction for Efficient Text-to-Image Generation
Sadeep Jayasumana
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2024)
Preview abstract
Modern text-to-image generation models produce high-quality images that are both photorealistic and faithful to the text prompts. However, this quality comes at significant computational cost: nearly all of these models are iterative and require running sampling multiple times with large models. This iterative process is needed to ensure that different regions of the image are not only aligned with the text prompt, but also compatible with each other. In this work, we propose a light-weight approach to achieving this compatibility between different regions of an image, using a Markov Random Field (MRF) model. We demonstrate the effectiveness of this method on top of the latent token-based Muse text-to-image model. The MRF richly encodes the compatibility among image tokens at different spatial locations to improve quality and significantly reduce the required number of Muse sampling steps. Inference with the MRF is significantly cheaper, and its parameters can be quickly learned through back-propagation by modeling MRF inference as a differentiable neural-network layer. Our full model, MarkovGen, uses this proposed MRF model to both speed up Muse by 1.5X and produce higher quality images by decreasing undesirable image artifacts.
View details
Preview abstract
Motivated by recent advances in large language models for NLP, we design a time-series foundation model for forecasting whose out-of-the-box zero-shot performance on a variety of datasets, matches the accuracy of state-of-the-art supervised forecasting models for each individual dataset. Our model is based on pretraining a patched-decoder style attention model on a large time series dataset, and can work well across different forecasting history lengths, prediction lengths and temporal granularities.
View details
Preview abstract
In this paper, we investigate a problem of \emph{actively} learning threshold in latent space, where the \emph{unknown} reward $g(\gamma, v)$ depends on the proposed threshold $\gamma$ and latent value $v$ and it can be \emph{only} achieved if the threshold is lower than or equal to the \emph{unknown} latent value. This problem has broad applications in practical scenarios, e.g., reserve price optimization in online auctions, online task assignments in crowdsourcing, setting recruiting bars in hiring, etc. We first characterize the query complexity of learning a threshold with the expected reward at most $\eps$ smaller than the optimum and prove that the number of queries needed can be infinitely large even when $g(\gamma, v)$ is monotone with respect to both $\gamma$ and $v$. On the positive side, we provide a tight query complexity $\Tilde{\Theta}(1/\eps^3)$ when $g$ is monotone and the CDF of value distribution is Lipschitz. Moreover, we show a tight $\Tilde{\Theta}(1/\eps^3)$ query complexity can be achieved as long as $g$ satisfies one-sided Lipschitzness, which provides a complete characterization for this problem. Finally, we extend this model to an online learning setting and demonstrate a tight $\Theta(T^{2/3})$ regret bound using continuous-arm bandit techniques and the aforementioned query complexity results.
View details
Preview abstract
Misgendering is the act of referring to someone in way that does not reflect their gender identity. Translation systems, including foundation models capable of translation, can produce errors that result in misgendering harms. To measure the extent of such potential harms when translating into and out of English, we introduce a dataset, MiTTenS, covering 26 languages. The dataset is constructed with handcrafted passages that target known failure patterns, longer synthetically generated passages, and natural passages sourced from multiple domains. We demonstrate the usefulness of the dataset by evaluating both dedicated neural machine translation systems and foundation models, and show that all systems exhibit errors resulting in misgendering harms, even in high resource languages.
View details
General Identifiability and Achievability for Causal Representation Learning
Burak Varici
Emre Acarturk
Ali Tajer
AISTATS 2024 (Oral), Oral Talk at NeurIPS Causal Representation Learning Workshop 2023. (2024)
Preview abstract
This paper focuses on causal representation learning (CRL) under a general nonparametric latent causal model and a general transformation model that maps the latent data to the observational data. It establishes identifiability and achievability results using two hard uncoupled interventions per node in the latent causal graph. Notably, one does
not know which pair of intervention environments have the same node intervened (hence,
uncoupled). For identifiability, the paper establishes that perfect recovery of the latent
causal model and variables is guaranteed under uncoupled interventions. For achievability,
an algorithm is designed that uses observational and interventional data and recovers
the latent causal model and variables with provable guarantees. This algorithm leverages
score variations across different environments to estimate the inverse of the transformer and,
subsequently, the latent variables. The analysis, additionally, recovers the identifiability
result for two hard coupled interventions, that is when metadata about the pair of environments that have the same node intervened is known. This paper also shows that when observational data is available, additional faithfulness assumptions that are adopted by the existing literature are unnecessary
View details
Preview abstract
The article summarizes the unique challenges and strategies required for a successful GTM (Go to market) strategy in enterprise world. We cover how enterprise PM function is unique from regular PM, and why enterprise PMs must look at distribution as an inherent product process. We also share a framework for thinking about various components of GTM strategy. Key aspects include customer segmentation, account acquisition strategies, product packaging, positionining and marketing; and technical enablement and content distribution.
View details
UINav: A Practical Approach to Train On-Device Automation Agents
Wei Li
Fu-Lin Hsu
Will Bishop
Folawiyo Campbell-Ajala
2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024) - Industry Track
Preview abstract
Automation systems that can autonomously drive application user interfaces to complete user tasks are of great benefit, especially when users are situationally or permanently impaired. Prior automation systems do not produce generalizable models while AI-based automation agents work reliably only in simple, hand-crafted applications or incur high computation costs. We propose UINav, a demonstration-based approach to train automation agents that fit mobile devices, yet achieving high success rates with modest numbers of demonstrations. To reduce the demonstration overhead, UINav, uses a referee model that provides users with immediate feedback on tasks where the agent fails, and automatically augments human demonstrations to increase diversity in training data. Our evaluation shows that with only 10 demonstrations UINav, can achieve 70% accuracy, and that with enough demonstrations it can surpass 90% accuracy.
View details
Media Mix Model Calibration With Bayesian Priors
Mike Wurm
Brenda Price
Ying Liu
research.google (2024)
Preview abstract
Effective model calibration is a critical and indispensable component in developing Media Mix Models (MMMs). One advantage of Bayesian-based MMMs lies in their capacity to accommodate the information from experiment results and the modelers' domain knowledge about the ad effectiveness by setting priors for the model parameters. However, it remains ambiguous about how and which Bayesian priors should be tuned for calibration purpose. In this paper, we propose a new calibration method through model reparameterization. The reparameterized model includes Return on Ads Spend (ROAS) as a model parameter, enabling straightforward adjustment of its prior distribution to align with either experiment results or the modeler's prior knowledge. The proposed method also helps address several key challenges regarding combining MMMs and incrementality experiments. We use simulations to demonstrate that our approach can significantly reduce the bias and uncertainty in the resultant posterior ROAS estimates.
View details
LinguaMeta: Unified Metadata for Thousands of Languages
Uche Okonkwo
Emily Drummond
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Preview abstract
We introduce LinguaMeta, a unified resource for language metadata for thousands of languages, including language codes, names, number of speakers, writing systems, countries, official status, coordinates, and language varieties. The resources are drawn from various existing repositories and supplemented with our own research. Each data point is tagged for its origin, allowing us to easily trace back to and improve existing resources with more up-to-date and complete metadata. The resource is intended for use by researchers and organizations who aim to extend technology to thousands of languages.
View details
Android Permissions: Evolution, Attacks, and Best Practices
IEEE Security & Privacy (2024)
Preview abstract
In this article, we study the evolution of Android permissions. We describe the rationale behind key changes in Android’s permission model and disclose two permission-related security vulnerabilities we discovered. Finally, we provide developers actionable insights to proactively address permission-related security and privacy risks during development.
View details
SEMQA: Semi-Extractive Multi-Source Question Answering
Haitian Sun
NAACL (2024) (to appear)
Preview abstract
Recently proposed long-form question answering (QA) systems, supported by large language models (LLMs), have shown promising capabilities. Yet, attributing and verifying their generated abstractive answers can be difficult, and automatically evaluating their accuracy remains an ongoing challenge.
In this paper, we introduce a new QA task for answering multi-answer questions by summarizing multiple diverse sources in a semi-extractive fashion. Specifically, Semi-extractive Multi-source QA (SEMQA) requires models to output a comprehensive answer while mixing between factual quoted spans---copied verbatim from given input sources---and non-factual free-text connectors that glue these spans together into a single cohesive passage. This setting bridges the gap between the outputs of well-grounded but constrained extractive QA systems and more fluent but harder to attribute fully abstractive answers. Particularly, it enables a new mode for language models that leverages their advanced language generation capabilities, while also producing fine in-line attributions by-design that are easy to verify, interpret, and evaluate. To study this task, we create the first dataset of this kind with human-written semi-extractive answers to natural and generated questions, and define text-based evaluation metrics. Experimenting with several LLMs in various settings, we find this task to be surprisingly challenging, demonstrating the importance of our work for developing and studying such consolidation capabilities.
View details
Preview abstract
In the present computerized period, information driven navigation is essential for the progress of cooperative work areas. This paper gives an extensive examination of how information designing, distributed storage, and business insight synergistically engage groups. We look at the basic standards of information designing, zeroing in on the plan, development, and the management of adaptable information pipelines. The job of distributed storage is investigated, featuring its ability to give adaptable, secure, and open information arrangements. Besides, we dive into business knowledge instruments and their capacity to change crude information into significant experiences. Through contextual analyses and exact information, we delineate the groundbreaking effect of these advances in group efficiency, coordinated effort, and dynamic cycles. This examination highlights the significance of incorporating hearty information designing works on, utilizing distributed storage arrangements, and utilizing complex business knowledge apparatuses to establish information engaged cooperative conditions.
View details
Preview abstract
Advances in deep learning systems have allowed large models to match or surpass human accuracy on a number of skills such as image classification, basic programming, and standardized test taking. As the performance of the most capable models begin to saturate on tasks where humans already achieve high accuracy, it becomes necessary to benchmark models on increasingly complex abilities. One such task is forecasting the future outcome of events. In this work we describe experiments using a novel dataset of real world events and associated human predictions, an evaluation metric to measure forecasting ability, and the accuracy of a number of different LLM based forecasting designs on the provided dataset. Additionally, we analyze the performance of the LLM forecasters against human predictions and find that models still struggle to make accurate predictions about the future. Our follow-up experiments indicate this is likely due to models' tendency to guess that most events are unlikely to occur (which tends to be true for many prediction datasets, but does not reflect actual forecasting abilities). We reflect on next steps for developing a systematic and reliable approach to studying LLM forecasting.
View details