Yuan Xue
Authored Publications
Preview abstract
Stochastic dual dynamic programming (SDDP) is one of the state-of-the-art algorithms for multi-stage stochastic optimization, yet its cost increases exponentially with the number of decision variables, so it quickly becomes inapplicable to high-dimensional problems. We introduce a neuralized component into SDDP, which outputs a piece-wise linear function in a low-dimensional space to approximate the value function, based on the context of the problem instances. The neuralized component continually evolves to abstract an effective low-dimensional action space and improves the quality of the value function approximation for each problem based on prior successful experiences. It is seamlessly integrated with SDDP to form our neural-enhanced solver, which reaches optimality without loss of accuracy and at faster speed for high-dimensional, long-horizon multi-stage stochastic optimization. We conduct thorough empirical experiments to demonstrate the benefits of the solver, from transferability to scalability. It significantly outperforms the competitors, including SDDP and variants of RL algorithms, in solution quality, feasibility, and computational speed.
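As a rough illustration of the piece-wise linear value function mentioned in this abstract, the sketch below evaluates a convex function represented as the maximum over a set of affine cuts, which is the standard form of an SDDP value-function approximation in a minimization problem. The function name `evaluate_cuts` and the cut values are hypothetical, chosen only for illustration; this is not the authors' solver and does not include the learned low-dimensional projection.

```python
from typing import List, Sequence, Tuple

# A cut is (slope vector, intercept); both are illustrative values, not from the paper.
Cut = Tuple[Sequence[float], float]


def evaluate_cuts(cuts: List[Cut], x: Sequence[float]) -> float:
    """Evaluate V(x) = max_k (slope_k . x + intercept_k) over all cuts."""
    if not cuts:
        raise ValueError("at least one cut is required")
    return max(
        sum(a * xi for a, xi in zip(slope, x)) + intercept
        for slope, intercept in cuts
    )


# Three hypothetical cuts in a 2-dimensional state space.
cuts = [
    ([1.0, 0.0], 0.0),
    ([-1.0, 0.0], 0.0),
    ([0.0, 2.0], -1.0),
]

value = evaluate_cuts(cuts, [0.5, 1.0])
print(value)
```

Each additional cut can only raise the approximation, so the function tightens from below as cuts accumulate; the cost of evaluating it grows with the number of cuts and the dimension of `x`, which is why a lower-dimensional representation is attractive.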
Multi-task prediction of organ dysfunction in the ICU using sequential sub-network routing
Diana Mincu
Eric Loreaux
Anne Mottram
Hugh Montgomery
Ali Connell
Nenad Tomašev
Martin Seneviratne
Journal of the American Medical Informatics Association (JAMIA) (2021)
Preview abstract
Introduction:
Multi-task learning (MTL) using electronic health records (EHRs) allows concurrent prediction of multiple endpoints. MTL has shown promise in improving model performance and training efficiency; however it often suffers from negative transfer - impaired learning if tasks are not appropriately selected. We introduce a sequential sub-network routing (SeqSNR) architecture which uses soft parameter sharing to find related tasks and encourage cross-learning between them.
Materials and Methods:
Using the Medical Information Mart for Intensive Care (MIMIC-III) dataset, we train deep neural network models to predict the onset of six endpoints including specific organ dysfunctions and general clinical outcomes: acute kidney injury, continuous renal replacement therapy, mechanical ventilation, vasoactive medications, mortality, and length of stay. We compare single task models (ST) with naive multi-task (shared bottom, SB) and SeqSNR in terms of discriminative performance and label efficiency.
Results:
SeqSNR showed a modest yet statistically significant performance boost across at least 4 out of 6 tasks compared to SB and ST. When the size of the training dataset was reduced for a given task, SeqSNR outperformed ST in all cases, showing an average AUPRC boost of 2.1%, 2.9%, and 2.1% for tasks using 1%, 5%, and 10% of labels, respectively.
Discussion and Conclusion:
Multi-task learning has variable performance compared to single-task learning, with the possibility for negative transfer. The SeqSNR architecture outperforms SB and ST in discriminative performance and shows superior performance in terms of label efficiency. SeqSNR should be considered for multi-task predictive modeling using EHR data.
Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer
Edward Choi
Zhen Xu
Yujia Li
Gerardo Flores
Association for the Advancement of Artificial Intelligence (AAAI) (2020)
Preview abstract
Effective modeling of electronic health records (EHR) is rapidly becoming an important topic in both academia and industry. A recent study showed that using the graphical structure underlying EHR data (e.g. the relationship between diagnoses and treatments) improves the performance of prediction tasks such as heart failure prediction. However, EHR data do not always contain complete structure information. Moreover, when it comes to claims data, structure information is completely unavailable to begin with. Under such circumstances, can we still do better than just treating EHR data as a flat-structured bag-of-features? In this paper, we study the possibility of jointly learning the hidden structure of EHR while performing supervised prediction tasks on EHR data. Specifically, we argue that the Transformer is a suitable base model for learning the hidden EHR structure, and propose the Graph Convolutional Transformer, which uses data statistics to guide the structure learning process. The proposed model consistently outperformed previous approaches empirically, on both synthetic data and publicly available EHR data, for various prediction tasks such as graph reconstruction and readmission prediction, indicating that it can serve as an effective general-purpose representation learning algorithm for EHR data.
Preview abstract
Capturing the inter-dependencies among multiple types of clinically-critical events is critical not only to accurate future event prediction, but also to better treatment planning. In this work, we propose a deep latent state-space generative model to capture the interactions among different types of correlated clinical events (e.g., kidney failure, mortality) by explicitly modeling the temporal dynamics of patients' latent states. Based on these learned patient states, we further develop a new general discrete-time formulation of the hazard rate function to estimate the survival distribution of patients with significantly improved accuracy. Extensive evaluations over real EMR data show that our proposed model compares favorably to various state-of-the-art baselines. Furthermore, our method also uncovers meaningful insights about the latent correlations among mortality and different types of organ failures.
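To make the link between a discrete-time hazard rate and a survival distribution concrete, here is a minimal sketch of the textbook relation S(t) = ∏_{s ≤ t} (1 − h_s). It is not the new formulation the abstract refers to, which is not given here; the function name `survival_from_hazards` and the hazard values are assumptions made for illustration.

```python
from typing import List, Sequence


def survival_from_hazards(hazards: Sequence[float]) -> List[float]:
    """Convert per-step hazards h_1..h_T into survival probabilities S(1)..S(T).

    h_t is the probability of the event at step t given survival up to t,
    so S(t) is the running product of (1 - h_s) for s <= t.
    """
    survival = []
    alive = 1.0
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must lie in [0, 1]")
        alive *= 1.0 - h
        survival.append(alive)
    return survival


# Hypothetical hazards for three consecutive time steps.
curve = survival_from_hazards([0.1, 0.2, 0.5])
print(curve)
```

In a model like the one described, the per-step hazards would be produced from the learned patient states rather than fixed by hand.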
Predicting inpatient medication orders from electronic health record data
Kathryn Rough
Kun Zhang
Atul J. Butte
Alvin Rajkomar
Clinical Pharmacology and Therapeutics (2020)
Preview abstract
In a general inpatient population, we predicted patient‐specific medication orders based on structured information in the electronic health record (EHR). Data on over three million medication orders from an academic medical center were used to train two machine‐learning models: a deep learning sequence model and a logistic regression model. Both were compared with a baseline that ranked the most frequently ordered medications based on a patient’s discharge hospital service and amount of time since admission. Models were trained to predict from 990 possible medications at the time of order entry. Fifty‐five percent of medications ordered by physicians were ranked in the sequence model’s top‐10 predictions (logistic model: 49%) and 75% ranked in the top‐25 (logistic model: 69%). Ninety‐three percent of the sequence model’s top‐10 prediction sets contained at least one medication that physicians ordered within the next day. These findings demonstrate that medication orders can be predicted from information present in the EHR.
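The top‐10 and top‐25 figures above are top‐k hit rates: the share of actual orders that appear among a model's k highest‐ranked predictions. The sketch below shows one plausible way to compute such a rate; the function name `top_k_hit_rate` and the placeholder medication labels are hypothetical, and the study's exact evaluation procedure may differ.

```python
from typing import List, Sequence


def top_k_hit_rate(
    ranked_predictions: Sequence[Sequence[str]],
    ordered: Sequence[str],
    k: int,
) -> float:
    """Fraction of actual orders found in the top-k of the matching ranking."""
    if len(ranked_predictions) != len(ordered):
        raise ValueError("one ranking is required per actual order")
    if not ordered:
        raise ValueError("at least one order is required")
    hits = sum(
        1
        for ranking, actual in zip(ranked_predictions, ordered)
        if actual in ranking[:k]
    )
    return hits / len(ordered)


# Placeholder labels stand in for medications; best-ranked item comes first.
rankings: List[List[str]] = [
    ["a", "b", "c"],
    ["b", "c", "a"],
    ["c", "a", "b"],
]
actual_orders = ["b", "a", "c"]

for k in (1, 2, 3):
    print(k, top_k_hit_rate(rankings, actual_orders, k))
```

The rate can only stay the same or rise as k grows, which matches the pattern of the reported top‐10 and top‐25 numbers.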
Preview abstract
The paradigm of 'pretraining' from a set of relevant auxiliary tasks and then 'finetuning' on a target task has been successfully applied in many different domains. However, when the auxiliary tasks are abundant, with complex relationships to the target task, using domain knowledge or searching over all possible pretraining setups are inefficient strategies. To address this challenge, we propose a method to automatically select, from a large set of auxiliary tasks, those that yield a representation most useful to the target task. In particular, we develop an efficient algorithm that uses automatic auxiliary task selection within a nested-loop meta-learning process. We have applied this algorithm to the task of clinical outcome prediction in electronic medical records, learning from a large number of self-supervised tasks related to forecasting patient trajectories. Experiments on a real clinical dataset demonstrate the superior predictive performance of our method compared to direct supervised learning, naive pretraining, and multitask learning, in particular in low-data scenarios when the primary task has very few examples. With detailed ablation analysis, we further show that the selection rules are interpretable and able to generalize to unseen target tasks with new data.
Preview abstract
Accurate identification and localization of abnormalities from radiology images play an integral part in clinical diagnosis and treatment planning. Building a highly accurate prediction model for these tasks usually requires a large number of images manually annotated with labels and finding sites of abnormalities. In reality, however, such annotated data are expensive to acquire, especially those with location annotations. We need methods that can work well with only a small amount of location annotations. To address this challenge, we present a unified approach that simultaneously performs disease identification and localization through the same underlying model for all images. We demonstrate that our approach can effectively leverage both class information and limited location annotation, and significantly outperforms the comparative reference baseline in both classification and localization tasks.