Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

people standing in front of a screen with images and a chipboard

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
1 - 15 of 10133 publications
    Partially Interpretable Models with Guarantees on Coverage and Accuracy
    Nave Frost
    Zachary Lipton
    Michal Moshkovitz
    Algorithmic Learning Theory (ALT) (2024)
    Preview abstract Simple, sufficient explanations furnished by short decision lists can be useful for guiding stakeholder actions. Unfortunately, this transparency can come at the expense of the higher accuracy enjoyed by black box methods, like deep nets. To date, practitioners typically either (i) insist on the simpler model, forsaking accuracy; or (ii) insist on maximizing accuracy, settling for post-hoc explanations of dubious faithfulness. In this paper, we propose a hybrid \emph{partially interpretable model} that represents a compromise between the two extremes. In our setup, each input is first processed by a decision list that can either execute a decision or abstain, handing off authority to the opaque model. The key to optimizing the decision list is to optimally trade off the accuracy of the composite system against coverage (the fraction of the population that receives explanations). We contribute a new principled algorithm for constructing partially interpretable decision lists, providing theoretical guarantees addressing both interpretability and accuracy. As an instance of our result, we prove that when the optimal decision list has length $k$, coverage $c$, and $b$ mistakes, our algorithm will generate a decision list that has length no greater than $4k$, coverage at least $c/2$, and makes at most $4b$ mistakes. Finally, we empirically validate the effectiveness of the new model. View details
    Augmented Object Intelligence with XR-Objects
    Mustafa Doga Dogan
    Karan Ahuja
    Andrea Colaco
    Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST), ACM (2024), pp. 1-15
    Preview abstract Seamless integration of physical objects as interactive digital entities remains a challenge for spatial computing. This paper explores Augmented Object Intelligence (AOI) in the context of XR, an interaction paradigm that aims to blur the lines between digital and physical by equipping real-world objects with the ability to interact as if they were digital, where every object has the potential to serve as a portal to digital functionalities. Our approach utilizes real-time object segmentation and classification, combined with the power of Multimodal Large Language Models (MLLMs), to facilitate these interactions without the need for object pre-registration. We implement the AOI concept in the form of XR-Objects, an open-source prototype system that provides a platform for users to engage with their physical environment in contextually relevant ways using object-based context menus. This system enables analog objects to not only convey information but also to initiate digital actions, such as querying for details or executing tasks. Our contributions are threefold: (1) we define the AOI concept and detail its advantages over traditional AI assistants, (2) detail the XR-Objects system’s open-source design and implementation, and (3) show its versatility through various use cases and a user study. View details
    With Great Power Comes Great Responsibility: Security and Privacy Issues of Modern Browser APIs
    Harun Oz
    Daniele Cono D’Elia
    Abbas Acar
    Riccardo Lazzeretti
    Selcuk Uluagac
    IEEE Security and Privacy (2024)
    Preview abstract This paper discusses security and privacy issues in modern Browser APIs by categorizing them based on their functionality. With this study, we aim to alert the community about these issues and motivate further research into analyzing the security and privacy concerns within modern Browser APIs. View details
    Preview abstract Neglected tropical diseases (NTDs) and infectious diseases disproportionately affect the poorest regions of the world. While large language models (LLMs) have shown promise for medical question answering, there is limited work focused on tropical and infectious disease-specific explorations. We introduce TRINDs, a dataset of 52 tropical and infectious diseases with demographic and semantic clinical and consumer augmentations. We evaluate various context and counterfactual locations to understand their influence on LLM performance. Results show that LLMs perform best when provided with contextual information such as demographics, location, and symptoms. We also develop TRINDs-LM, a tool that enables users to enter symptoms and contextual information to receive a most likely diagnosis. In addition to the LLM evaluations, we also conducted a human expert baseline study to assess the accuracy of human experts in diagnosing tropical and infectious diseases with 7 medical and public health experts. This work demonstrates methods for creating and evaluating datasets for testing and optimizing LLMs, and the use of a tool that could improve digital diagnosis and tracking of NTDs. View details
    Preview abstract In the present computerized period, information driven navigation is essential for the progress of cooperative work areas. This paper gives an extensive examination of how information designing, distributed storage, and business insight synergistically engage groups. We look at the basic standards of information designing, zeroing in on the plan, development, and the management of adaptable information pipelines. The job of distributed storage is investigated, featuring its ability to give adaptable, secure, and open information arrangements. Besides, we dive into business knowledge instruments and their capacity to change crude information into significant experiences. Through contextual analyses and exact information, we delineate the groundbreaking effect of these advances in group efficiency, coordinated effort, and dynamic cycles. This examination highlights the significance of incorporating hearty information designing works on, utilizing distributed storage arrangements, and utilizing complex business knowledge apparatuses to establish information engaged cooperative conditions. View details
    GASS: GPU Automated Sharing at Scale
    Dragos Sbirlea
    Jiafan Zhu
    Konstantinos Menychtas
    Yuang Liu
    Zhijing Gene Qin
    The IEEE International Conference on Cloud Computing (CLOUD) 2024 (2024)
    Preview abstract General-purpose GPUs, with their powerful numerical computing capacity, are popular platforms for accelerating machine-learning workloads. However, our experience with a large scale production deployment shows that typical GPU work-loads often fail to keep the GPU pipeline fully occupied, resulting in low overall resource utilization. To address this inefficiency, we have designed and implemented GPU Automated Sharing at Scale (GASS). GASS relies on fine-grained time-multiplexing to let GPU compute resources be shared among different tasks, and on-demand paging to let GPU memory be shared among them. GASS mitigates sharing performance anomalies by using real-time performance monitoring to drive adaptive rescheduling. Our cluster level evaluation shows the aggregated GPU throughput is increased by 50% under GASS and that sharing enables the cluster to support 19% more GPU jobs. View details
    Comparative analysis of genAI features in Business Intelligence Platforms
    Aqsa Fulara
    International Journal of Computer Trends and Technology, Volume 72 Issue 4, 95-101, April 2024 (2024)
    Preview abstract The study is a comparative analysis of generative AI capabilities and their applications in BI plaforms. The rapid advancement here has opened new frontiers for data driven decision making and insights generation. However, integration in BI tools is largely unexplored in academia. The findings reveal significant variations in approach taken by different BI tools for similar genAI tasks. View details
    Preview abstract Slow concept drift is a ubiquitous, yet under-studied problem in practical machine learning systems. Although recent data is more indicative of future data in these settings, naively prioritizing these instances runs the risk of losing valuable information from the past. We propose an optimization-driven approach towards balancing instance importance over large training windows. First, we model instance relevance using a mixture of multiple timescales of decay, allowing us to capture rich temporal trends. Second, we learn an auxiliary \textit{scorer model} that recovers the appropriate mixture of timescales as a function of the instance itself. Finally, we propose a nested optimization objective for learning the scorer, by which it maximizes forward transfer for the learned model. Experiments on a large real-world dataset of 39M photos over a 9 year period show upto 15\% relative gains in accuracy compared to other robust learning baselines. We replicate our gains on two collections of real-world datasets for non-stationary learning, and extend our work to continual learning settings where, too, we beat SOTA methods by large margins. View details
    Preview abstract Predictive uncertainty-a model's self awareness regarding its accuracy on an input-is key for both building robust models via training interventions and for test-time applications such as selective classification. We propose a novel instance-conditioned reweighting approach that captures predictive uncertainty using an auxiliary network and unifies these train- and test-time applications. The auxiliary network is trained using a meta-objective in a bilevel optimization framework. A key contribution of our proposal is the meta-objective of minimizing the dropout variance, an approximation of Bayesian Predictive uncertainty. We show in controlled experiments that we effectively capture the diverse specific notions of uncertainty through this meta-objective, while previous approaches only capture certain aspects. These results translate to significant gains in real-world settings-selective classification, label noise, domain adaptation, calibration-and across datasets-Imagenet, Cifar100, diabetic retinopathy, Camelyon, WILDs, Imagenet-C,-A,-R, Clothing1M, etc. For Diabetic Retinopathy, we see upto 3.4%/3.3% accuracy and AUC gains over SOTA in selective classification. We also improve upon large-scale pretrained models such as PLEX. View details
    Preview abstract Private Everlasting Prediction (PEP), recently introduced by Naor et al. [2023], is a model for differentially private learning in which the learner never publicly releases a hypothesis. Instead, it provides a black-box access to a ``prediction oracle'' that can predict the labels of an endless stream of unlabeled examples drawn from the underlying distribution. Importantly, PEP provides privacy both for the initial training set and for the endless stream of classification queries. We present two conceptual modifications to the definition of PEP, as well as new constructions exhibiting significant improvements over prior work. Specifically, our contributions include: (1) Robustness: PEP only guarantees accuracy provided that all the classification queries are drawn from the correct underlying distribution. A few out-of-distribution queries might break the validity of the prediction oracle for future queries, even for future queries which are sampled from the correct distribution. We incorporate robustness against such poisoning attacks into the definition of PEP, and show how to obtain it. (2) Dependence of the privacy parameter delta in the time horizon: We present a relaxed privacy definition, suitable for PEP, that allows us to disconnect the privacy parameter delta from the number of total time steps T. This allows us to obtain algorithms for PEP whose sample complexity is independent from T, thereby making them "truly everlasting". This is in contrast to prior work where the sample complexity grows with polylog(T). (3) New constructions: Prior constructions for PEP exhibit sample complexity that is quadratic in the VC dimension of the target class. We present new constructions of PEP for axis-aligned rectangles and for decision-stumps, that exhibit sample complexity linear in the dimension (instead of quadratic). We show that our constructions satisfy very strong robustness properties. View details
    Connecting Language Technologies with Rich, Diverse Data Sources Covering Thousands of Languages
    Sebastian Ruder
    Julia Kreutzer
    Clara Rivera
    Ishank Saxena
    Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
    Preview abstract Contrary to common belief, there are rich and diverse data sources available for many thousands of languages, which can be used to develop technologies for these languages. In this paper, we provide an overview of some of the major online data sources, the types of data that they provide access to, potential applications of this data, and the number of languages that they cover. Even this covers only a small fraction of the data that exists; for example, printed books are published in many languages but few online aggregators exist. View details
    Experiencing Thing2Reality: Transforming 2D Content into Conditioned Multiviews and 3D Gaussian Objects for XR Communication
    Erzhen Hu
    Mingyi Li
    Seongkook Heo
    Adjunct Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology, ACM (2024)
    Preview abstract During remote communication, participants share both digital and physical content, such as product designs, digital assets, and environments, to enhance mutual understanding. Recent advances in augmented communication have facilitated users to swiftly create and share digital 2D copies of physical objects from video feeds into a shared space. However, the conventional 2D representation of digital objects restricts users’ ability to spatially reference items in a shared immersive environment. To address these challenges, we propose Thing2Reality, an Extended Reality (XR) communication platform designed to enhance spontaneous discussions regard-ing both digital and physical items during remote sessions. WithThing2Reality, users can quickly materialize ideas or physical objects in immersive environments and share them as conditioned multiview renderings or 3D Gaussians. Our system enables users to interact with remote objects or discuss concepts in a collaborative manner. View details
    Preview abstract Motivated by the necessity of guiding and monitoring students' progress in real-time when assembling circuits during in-class activities we propose BlinkBoard, an augmented breadboard to enhance offline as well as online physical computing classes. BlinkBoard uses LEDs placed on each row of the breadboard to guide, via four blinking patterns, how to place and connect components and wires. It also uses a set of Input/Output pins to sense voltage levels at user-specified rows or to generate voltage output. Our hardware uses an open JSON protocol of commands and responses that can be integrated with a graphical application hosted on a computer that ensures bidirectional communication between each of the students' BreadBoard and the instructor's dashboard and slides. The hardware is affordable and simple, partially due to a customized circuit configured via a hardware description language that handles the LEDs' patterns with minimal load on the Arduino micro-controller. Finally, we briefly show how this hardware made its way to a workshop with high-school students and an undergraduate class in a design department. View details
    PikeLPN: Mitigating Overlooked Inefficiencies of Low-Precision Neural Networks
    Marina Neseem
    Conor McCullough
    Randy Hsin
    Chas Leichner
    Shan Li
    In Suk Chong
    Andrew Howard
    Lukasz Lew
    Sherief Reda
    Ville-Mikko Rautio
    Daniele Moro
    Conference on Computer Vision and Pattern Recognition (2024) (to appear)
    Preview abstract Low-precision quantization is recognized for its efficacy in neural network optimization. Our analysis reveals that non-quantized elementwise operations which are prevalent in layers such as parameterized activation functions, batch normalization, and quantization scaling dominate the inference cost of low-precision models. These non-quantized elementwise operations are commonly overlooked in SOTA efficiency metrics such as Arithmetic Computation Effort (ACE). In this paper, we propose ACEv2 - an extended version of ACE which offers a better alignment with the inference cost of quantized models and their energy consumption on ML hardware. Moreover, we introduce PikeLPN, a model that addresses these efficiency issues by applying quantization to both elementwise operations and multiply-accumulate operations. In particular, we present a novel quantization technique for batch normalization layers named QuantNorm which allows for quantizing the batch normalization parameters without compromising the model performance. Additionally, we propose applying Double Quantization where the quantization scaling parameters are quantized. Furthermore, we recognize and resolve the issue of distribution mismatch in Separable Convolution layers by introducing Distribution-Heterogeneous Quantization which enables quantizing them to low-precision. PikeLPN achieves Pareto-optimality in efficiency-accuracy trade-off with up to 3X efficiency improvement compared to SOTA low-precision models. View details
    Beyond SOT: Tracking Multiple Generic Objects at Once
    Christoph Mayer
    Martin Danelljan
    Vittorio Ferrari
    Luc Van Gool
    WACV'24 (2024)
    Preview abstract Generic Object Tracking (GOT) is the problem of tracking target objects, specified by bounding boxes in the first frame of a video. While the task has received much attention in the last decades, researchers have almost exclusively focused on the single object setting. However multiobject GOT poses its own challenges and is more attractive in real-world applications. We attribute the lack of research interest into this problem to the absence of suitable benchmarks. In this work, we introduce a new largescale GOT benchmark, LaGOT, containing multiple annotated target objects per sequence. Our benchmark allows users to tackle key remaining challenges in GOT, aiming to increase robustness and reduce computation through joint tracking of multiple objects simultaneously. In addition, we propose a transformer-based GOT tracker baseline capable of joint processing of multiple objects through shared computation. Our approach achieves a 4× faster run-time in case of 10 concurrent objects compared to tracking each object independently and outperforms existing single object trackers on our new benchmark. In addition, our approach achieves highly competitive results on single-object GOT datasets, setting a new state of the art on TrackingNet with a success rate AUC of 84.4%. Our benchmark, code, results and trained models are available at https://github.com/visionml/pytracking. View details