Gal Elidan

Authored Publications
    Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
    Paul Roit
    Johan Ferret
    Lior Shani
    Geoffrey Cideron
    Matthieu Geist
    Sertan Girgin
    Léonard Hussenot
    Nikola Momchev
    Piotr Stanczyk
    Nino Vieillard
    Olivier Pietquin
    Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics (2023), 6252–6272
    Despite the seeming success of contemporary grounded text generation systems, they often tend to generate factually inconsistent text with respect to their input. This phenomenon is emphasized in tasks like summarization, in which the generated summaries should be corroborated by their source article. In this work we leverage recent progress on textual entailment models to directly address this problem for abstractive summarization systems. We use reinforcement learning with reference-free, textual-entailment rewards to optimize for factual consistency and explore the ensuing trade-offs, as improved consistency may come at the cost of less informative or more extractive summaries. Our results, according to both automatic metrics and human evaluation, show that our method considerably improves the faithfulness, salience and conciseness of the generated summaries.
    In educational dialogue settings students often provide answers that are incomplete. In other words, there is a gap between the answer the student provides and the perfect answer expected by the teacher. Successful dialogue hinges on the teacher asking about this gap in an effective manner, thus creating a rich and interactive educational experience. Here we focus on the problem of generating such gap-focused questions (GFQ) automatically. We define the task, highlight key desired aspects of a good GFQ, and propose a model that satisfies these. Finally, we provide an evaluation of our generated questions and compare them to manually generated ones, demonstrating competitive performance.
    A Neural Encoder for Earthquake Rate Forecasting
    Oleg Zlydenko
    Brendan Meade
    Alexandra Sharon Molchanov
    Sella Nevo
    Yohai bar Sinai
    Scientific Reports (2023)
    Forecasting the timing of earthquakes is a long-standing challenge. Moreover, it is still debated how to formulate this problem in a useful manner, or to compare the predictive power of different models. Here, we develop a versatile neural encoder of earthquake catalogs, and apply it to the fundamental problem of earthquake rate prediction, in the spatio-temporal point process framework. The epidemic type aftershock sequence model (ETAS) effectively learns a small number of parameters to constrain assumed functional forms for the space and time relationships of earthquake sequences (e.g., Omori-Utsu law). Here we introduce learned spatial and temporal embeddings for point process earthquake forecast models that capture complex correlation structures. We demonstrate the generality of this neural representation compared with the ETAS model using train-test data splits, and show how it enables the incorporation of additional geophysical information. In rate prediction tasks, the generalized model shows > 4% improvement in information gain per earthquake and the simultaneous learning of anisotropic spatial structures analogous to fault traces. The trained network can also be used to perform short-term prediction tasks, showing similar improvement while providing a 1,000-fold reduction in run-time.
    Active Learning with Label Comparisons
    Shay Moran
    Amir Globerson
    Uncertainty in Artificial Intelligence (submitted) (2022)
    Supervised learning typically relies on manual annotation of the true labels. However, when there are many potential labels, it will be time-consuming for a human annotator to search these for the best one. On the other hand, comparing two candidate labels is often much easier. In this paper, we focus on this type of pairwise supervision, and ask how it can be used effectively in learning, and in particular active learning. We obtain several surprising results in this context. In principle, finding the best label out of $k$ can be done with $k-1$ active queries. However, we show that there is a natural class where this approach is in fact sub-optimal, and that there is a more comparison-efficient active learning scheme. A key element in our analysis is the "label neighborhood graph" of the true distribution, which has an edge between two classes if they share a decision boundary. We also show that in the PAC setting, pairwise comparisons cannot provide improved sample complexity in the worst case. We complement our theoretical results with experiments, clearly demonstrating the effect of the neighborhood graph on sample complexity.
    Flood forecasting with machine learning models in an operational framework
    Asher Metzger
    Chen Barshai
    Dana Weitzner
    Frederik Kratzert
    Gregory Begelman
    Guy Shalev
    Hila Noga
    Moriah Royz
    Niv Giladi
    Ronnie Maor
    Sella Nevo
    Yotam Gigi
    HESS (2022)
    Google’s operational flood forecasting system was developed to provide accurate real-time flood warnings to agencies and the public, with a focus on riverine floods in large, gauged rivers. It became operational in 2018 and has since expanded geographically. This forecasting system consists of four subsystems: data validation, stage forecasting, inundation modeling, and alert distribution. Machine learning is used for two of the subsystems. Stage forecasting is modeled with Long Short-Term Memory (LSTM) networks and Linear models. Flood inundation is computed with the Thresholding and the Manifold models, where the former computes inundation extent and the latter computes both inundation extent and depth. The Manifold model, presented here for the first time, provides a machine-learning alternative to hydraulic modeling of flood inundation. When evaluated on historical data, all models achieve sufficiently high performance metrics for operational use. The LSTM showed higher skill than the Linear model, while the Thresholding and Manifold models achieved similar performance metrics for modeling inundation extent. During the 2021 monsoon season, the flood warning system was operational in India and Bangladesh, covering flood-prone regions around rivers with a total area of 287,000 km², home to more than 350M people. More than 100M flood alerts were sent to affected populations, to relevant authorities, and to emergency organizations. Current and future work on the system includes extending coverage to additional flood-prone locations, as well as improving modeling capabilities and accuracy.
    Explaining in Style: Training a GAN to explain a classifier in StyleSpace
    Yossi Gandelsman
    Yoav Itzhak Wald
    Bill Freeman
    Phillip Isola
    Amir Globerson
    Michal Irani
    Proc. ICCV 2021
    Image classification models can depend on multiple different semantic attributes of the image. An explanation of the decision of the classifier needs to both discover and visualize these properties. Here we present StylEx, a method for doing this, by training a generative model to specifically explain multiple attributes that underlie classifier decisions. A natural source for such attributes is the S-space of StyleGAN, which is known to generate semantically meaningful dimensions in the image. However, these will typically not correspond to classifier-specific attributes since standard GAN training is not dependent on the classifier. To overcome this, we propose a training procedure for a StyleGAN, which incorporates the classifier model. This results in an S-space that captures distinct attributes underlying classifier outputs. After training, the model can be used to visualize the effect of changing multiple attributes per image, thus providing an image-specific explanation. We apply StylEx to multiple domains, including animals, leaves, faces and retinal images. For these, we show how an image can be changed in different ways to change its classifier prediction. Our results show that the method finds attributes that align well with semantic ones, generate meaningful image-specific explanations, and are interpretable as measured in user studies.
    We study conversational domain exploration (CODEX), where the user’s goal is to enrich her knowledge of a given domain by conversing with an informative bot. Such conversations should be well grounded in high-quality domain knowledge as well as engaging and open-ended. A CODEX bot should be proactive and introduce relevant information even if not directly asked for by the user. The bot should also appropriately pivot the conversation to undiscovered regions of the domain. To address these dialogue characteristics, we introduce a novel approach termed dynamic composition that decouples candidate content generation from the flexible composition of bot responses. This allows the bot to control the source, correctness and quality of the offered content, while achieving flexibility via a dialogue manager that selects the most appropriate contents in a compositional manner. We implemented a CODEX bot based on dynamic composition and integrated it into the Google Assistant. As an example domain, the bot conversed about the NBA basketball league in a seamless experience, such that users were not aware whether they were conversing with the vanilla system or the one augmented with our CODEX bot. Results are positive and offer insights into what makes for a good conversation. To the best of our knowledge, this is the first real user experiment of open-ended dialogues as part of a commercial assistant system.
    HydroNets: Leveraging River Structure for Hydrologic Modeling
    Zach Moshe
    Asher Feivel Metzger
    Frederik Kratzert
    Sella Nevo
    Ran El-Yaniv
    ICLR 2020, Workshop on AI for Earth Sciences (to appear)
    Accurate and scalable hydrologic models are essential building blocks of several important applications, from water resource management to timely flood warnings. However, as the climate changes, precipitation and rainfall-runoff pattern variations become more extreme, and accurate training data that can account for the resulting distributional shifts become more scarce. In this work we present a novel family of hydrologic models, called HydroNets, which leverages river network structure. HydroNets are deep neural network models designed to exploit both basin-specific rainfall-runoff signals and upstream network dynamics, which can lead to improved predictions at longer horizons. The injection of the river structure prior knowledge reduces sample complexity and allows for scalable and more accurate hydrologic modeling even with only a few years of data. We present an empirical study over two large basins in India that convincingly supports the proposed model and its advantages.
    Spectral Algorithm for Shared Low-rank Matrix Regressions
    Yotam Gigi
    Sella Nevo
    Ami Wiesel
    2020 IEEE 11th Sensor Array and Multichannel Signal Processing Workshop (SAM) (2020)
    We consider multiple matrix regression tasks that share common weights in order to reduce sample complexity. For this purpose, we introduce the common mechanism regression model which assumes a shared right low-rank component across all tasks, but allows an individual per-task left low-rank component. We provide a closed form spectral algorithm for recovering the common component and derive a bound on its error as a function of the number of related tasks and the number of samples available for each of them. Both the algorithm and its analysis are natural extensions of known results in the context of phase retrieval and low rank reconstruction. We demonstrate the efficacy of our approach for the challenging task of remote river discharge estimation across multiple river sites, where data for each task is naturally scarce. In this scenario sharing a low-rank component between the tasks translates to a shared spectral reflection of the water, which is a true underlying physical model. We also show the benefit of the approach in the setting of image classification where the common component can be interpreted as the shared convolution filters.
    Floods are among the most common and deadly natural disasters in the world, and flood warning systems have been shown to be effective in reducing harm. Yet the majority of the world's vulnerable population does not have access to reliable and actionable warning systems, due to core challenges in scalability, computational costs, and data availability. In this paper we present two components of flood forecasting systems which were developed over the past year, providing access to these critical systems to 75 million people who didn't have this access before.
    Complex classifiers may exhibit "embarrassing" failures in cases that would be easily classified and justified by a human. Avoiding such failures is obviously paramount, particularly in domains where we cannot accept such unexplained behavior. In this work we focus on one such setting, where a label is perfectly predictable if the input contains certain features, and otherwise, it is predictable by a linear classifier. We define a related hypothesis class and determine its sample complexity. We also give evidence that efficient algorithms cannot, unfortunately, enjoy this sample complexity. We then derive a simple and efficient algorithm, and also give evidence that its sample complexity is optimal, among efficient algorithms. Experiments on sentiment analysis demonstrate the efficacy of the method, both in terms of accuracy and interpretability.
    ML for Flood Forecasting at Scale
    Sella Nevo
    Ami Wiesel
    Guy Shalev
    Mor Schlesinger
    Oleg Zlydenko
    Ran El-Yaniv
    Yotam Gigi
    Zach Moshe
    Proceedings of the NIPS AI for Social Good Workshop (2018)
    Effective riverine flood forecasting at scale is hindered by a multitude of factors, most notably the need to rely on human calibration in current methodology, the limited amount of data for a specific location, and the computational difficulty of building continent/global level models that are sufficiently accurate. Machine learning (ML) is primed to be useful in this scenario: learned models often surpass human experts in complex high-dimensional scenarios, and the framework of transfer or multitask learning is an appealing solution for leveraging local signals to achieve improved global performance. We propose to build on these strengths and develop ML systems for timely and accurate riverine flood prediction.
    Towards Global Remote Discharge Estimation: Using the Few to Estimate The Many
    Yotam Gigi
    Guy Shalev
    Sella Nevo
    Zach Moshe
    Ami Wiesel
    Proceedings of the NeurIPS AI for Social Good Workshop (2018)
    Learning hydrologic models for accurate riverine flood prediction at scale is a challenge of great importance. One of the key difficulties is the need to rely on in-situ river discharge measurements, which can be quite scarce and unreliable, particularly in regions where floods cause the most damage every year. Accordingly, in this work we tackle the problem of river discharge estimation at different river locations. A core characteristic of the data at hand (e.g. satellite measurements) is that we have few measurements for many locations, all sharing the same physics that underlie the water discharge. We capture this phenomenon in a simple but powerful common mechanism regression (CMR) model that has a local component as well as a shared one that captures the global discharge mechanism. The resulting learning objective is non-convex, but we show that we can find its global optimum by leveraging the power of joining local measurements across sites. In particular, using a spectral initialization with provable near-optimal accuracy, we can find the optimum using standard descent methods. We demonstrate the efficacy of our approach for the problem of discharge estimation using simulations.
    Approximate Linear Programming for Logistic Markov Decision Processes
    Martin Mladenov
    Dale Schuurmans
    Ofer Meshi
    Tyler Lu
    Proceedings of the Twenty-sixth International Joint Conference on Artificial Intelligence (IJCAI-17), Melbourne, Australia (2017), pp. 2486-2493
    This is an extended version of the paper Logistic Markov Decision Processes that appeared in the Proceedings of the Twenty-sixth International Joint Conference on Artificial Intelligence (IJCAI-17), pp. 2486-2493, Melbourne (2017). Online and mobile interactions with users, in areas such as advertising and product or content recommendation, have been transformed by machine learning techniques. However, such methods have largely focused on myopic prediction, i.e., predicting immediate user response to system actions (e.g., ads or recommendations), without explicitly accounting for the long-term impact on user behavior, nor the potential need for planning action sequences. In this work, we propose the use of Markov decision processes (MDPs) to formulate the long-term decision problem and address two key questions that emerge in their application to user interaction. The first focuses on model formulation, specifically, how best to construct MDP models of user interaction in a way that exploits the great successes of myopic prediction models. To this end, we propose a new model called logistic MDPs, an MDP formulation that allows the concise specification of transition dynamics. It does so by augmenting the natural factored form of dynamic Bayesian networks (DBNs) with user response variables that are captured by a logistic regression model (the latter being precisely the model used for myopic user interaction). The second question we address is how best to solve large logistic MDPs of this type. A variety of methods have been proposed for solving MDPs that exploit the conditional independence reflected in the DBN representations, including approximate linear programming (ALP). Despite their compact form, logistic MDPs do not admit the same conditional independence as DBNs, nor do they satisfy the linearity requirements for standard ALP.
    We propose a constraint generation approach to ALP for logistic MDPs that circumvents these problems by: (a) recovering compactness by conditioning on the logistic response variable; and (b) devising two procedures, one exact and one approximate, that linearize the search for violated constraints in the master LP. For the approximation procedure, we also derive error bounds on the quality of the induced policy. We demonstrate the effectiveness of our approach on advertising problems with up to several thousand sparse binarized features (up to 2^54 and 2^39 actions).
    Improper Deep Kernels
    Uri Heinemann
    Roi Livni
    Amir Globerson
    Proceedings of The 19th International Conference on Artificial Intelligence and Statistics (2016)
    Neural networks have recently re-emerged as a powerful hypothesis class, yielding impressive classification accuracy in multiple domains. However, their training is a non-convex optimization problem which poses theoretical and practical challenges. Here we address this difficulty by turning to "improper" learning of neural nets. In other words, we learn a classifier that is not a neural net but is competitive with the best neural net model given a sufficient number of training examples. Our approach relies on a novel kernel construction scheme in which the kernel is a result of integration over the set of all possible instantiations of neural models. It turns out that the corresponding integral can be evaluated in closed form via a simple recursion. Thus we translate the non-convex, hard learning problem of a neural net to an SVM with an appropriate kernel. We also provide sample complexity results which depend on the stability of the optimal neural net.
    Modern retrieval systems are often driven by an underlying machine learning model. The goal of such systems is to identify and possibly rank the few most relevant items for a given query or context. Thus, such systems are typically evaluated using a ranking-based performance metric such as the area under the precision-recall curve, the Fβ score, precision at fixed recall, etc. Obviously, it is desirable to train such systems to optimize the metric of interest. In practice, due to the scalability limitations of existing approaches for optimizing such objectives, large-scale retrieval systems are instead trained to maximize classification accuracy, in the hope that performance as measured via the true objective will also be favorable. In this work we present a unified framework that, using straightforward building block bounds, allows for highly scalable optimization of a wide range of ranking-based objectives. We demonstrate the advantage of our approach on several real-life retrieval problems that are significantly larger than those considered in the literature, while achieving substantial improvement in performance over the accuracy-objective baseline.
    Learning Max-Margin Tree Predictors
    Ofer Meshi
    Amir Globerson
    Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI2013) (2013), pp. 411