Jing (David) Dai
My interests include:
Spatial/Spatio-temporal Data Management and Analytics, Concurrency Control, Sustainability Management, Intelligent Transportation Systems, Geographic Information Systems.
Authored Publications
We propose an equity-aware GRAph-fusion differentiable Pooling neural network to accurately predict the spatio-temporal urban mobility (e.g., station-level bike usage in terms of departures and arrivals) with Equity (GRAPE). GRAPE consists of two independent hierarchical graph neural networks for two mobility systems — one as a target graph (i.e., a bike sharing system) and the other as an auxiliary graph (e.g., a taxi system). We have designed a convolutional fusion mechanism to jointly fuse the target and auxiliary graph embeddings and extract the shared spatial and temporal mobility patterns within the embeddings to enhance prediction accuracy. To further improve the equity of bike sharing systems for diverse communities, we focus on the bike resource allocation and model prediction performance, and propose to regularize the predicted bike resource as well as the accuracy across advantaged and disadvantaged communities, and thus mitigate the potential unfairness in the predicted bike sharing usage. Our evaluation of over 23 million bike rides and 100 million taxi trips in New York City and Chicago has demonstrated GRAPE to outperform all of the baseline approaches in terms of prediction accuracy (by 15.80% for NYC and 50.55% for Chicago on average) and social equity awareness (by 32.44% and 24.43% in terms of resource fairness for NYC and Chicago, and 13.36% and 16.52% in terms of performance fairness).
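The abstract above describes regularizing both the predicted bike resource and the prediction accuracy across advantaged and disadvantaged communities. The abstract does not give the actual regularization terms, so the following is only a minimal sketch of the general idea: the function name `equity_regularized_loss`, the weights `lam_resource` and `lam_perf`, and the use of mean squared error with absolute group gaps are all illustrative assumptions, not GRAPE's formulation.

```python
def mean(values):
    return sum(values) / len(values)


def mse(pred, true):
    # Mean squared error between predictions and observations.
    return mean([(p - t) ** 2 for p, t in zip(pred, true)])


def equity_regularized_loss(pred, true, is_disadvantaged,
                            lam_resource=1.0, lam_perf=1.0):
    """Accuracy loss plus two hypothetical fairness penalties.

    resource gap:    difference in mean predicted usage between groups
    performance gap: difference in prediction error between groups
    Both groups are assumed to be non-empty.
    """
    adv = [(p, t) for p, t, d in zip(pred, true, is_disadvantaged) if not d]
    dis = [(p, t) for p, t, d in zip(pred, true, is_disadvantaged) if d]

    adv_pred, adv_true = zip(*adv)
    dis_pred, dis_true = zip(*dis)

    accuracy = mse(pred, true)
    resource_gap = abs(mean(adv_pred) - mean(dis_pred))
    perf_gap = abs(mse(adv_pred, adv_true) - mse(dis_pred, dis_true))
    return accuracy + lam_resource * resource_gap + lam_perf * perf_gap
```

In this sketch, raising `lam_resource` or `lam_perf` trades some overall accuracy for smaller gaps between the two groups of stations.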
Multimodal Storytelling via Generative Adversarial Imitation Learning
Zhiqian Chen
Xuchao Zhang
Arnold Boedihardjo
Chang-Tien Lu
The Twenty-Sixth International Joint Conference on Artificial Intelligence (2017), pp. 3967-3973
Deriving event storylines is an effective summarization method for succinctly organizing extensive information, which can significantly alleviate the pain of information overload. The critical challenge is the lack of a widely recognized definition of a storyline metric. Prior studies have developed various approaches based on different assumptions about users’ interests. These works can extract interesting patterns, but their assumptions do not guarantee that the derived patterns will match users’ preferences. Moreover, their reliance on a single-modality source misses cross-modality information. This paper proposes a method, multimodal imitation learning via Generative Adversarial Networks (MIL-GAN), to directly model users’ interests as reflected by various data. In particular, the proposed model addresses the critical challenge by imitating users’ demonstrated storylines. The model is designed to learn the reward patterns from user-provided storylines and then apply the learned policy to unseen data. A user study demonstrates that the proposed approach is capable of acquiring the user’s implicit intent and outperforms competing methods by a substantial margin.
Trendi: Tracking Stories in News and Microblogs via Emerging, Evolving and Fading Topics
Xuchao Zhang
Liang Zhao
Zhiqian Chen
Arnold Boedihardjo
Chang-Tien Lu
IEEE BigData Conference 2017
In today’s era of information overload, people are struggling to detect the evolution of hot topics from massive news media and microblogs such as Twitter. Reports from mainstream news agencies and discussions from microblogs could complement each other to form a complete picture of major events. Existing work has generally focused on a single source, seldom attempting to combine multiple sources to track the evolution of topics through their emerging, evolving and fading phases, as this would require a considerably more sophisticated model. This paper proposes a novel story discovery model that integrates evolutionary topics in news and Twitter data sources using an incremental algorithm by 1) discovering complementary information from news and microblogs that provides a more complete view of major events; 2) modeling emerging, evolving and fading topics and features throughout ongoing events; and 3) creating a scalable algorithm that is capable of handling massive data from news and social media. The parameters of the new model are optimized using a novel algorithm based on the alternating direction method of multipliers (ADMM). Extensive experimental evaluations on multiple datasets from different domains demonstrate the effectiveness and efficiency of our proposed approach.
Unsupervised Spatial Event Detection in Targeted Domains with Applications to Civil Unrest Modeling
Liang Zhao
Feng Chen
Ting Hua
Chang-Tien Lu
Naren Ramakrishnan
PLOS ONE, 9 (2014), pp. 1-12
Twitter has become a popular data source as a surrogate for monitoring and detecting events. Targeted domains such as crime, election, and social unrest require the creation of algorithms capable of detecting events pertinent to these domains. Due to the unstructured language, short-length messages, dynamics, and heterogeneity typical of Twitter data streams, it is technically difficult and labor-intensive to develop and maintain supervised learning systems. We present a novel unsupervised approach for detecting spatial events in targeted domains and illustrate this approach using one specific domain, viz. civil unrest modeling. Given a targeted domain, we propose a dynamic query expansion algorithm to iteratively expand domain-related terms, and generate a tweet homogeneous graph. An anomaly identification method is utilized to detect spatial events over this graph by jointly maximizing local modularity and spatial scan statistics. Extensive experiments conducted in 10 Latin American countries demonstrate the effectiveness of the proposed approach.
Student-t based Robust Spatio-Temporal Prediction
Yang Chen
Feng Chen
T. Charles Clancy
Yao-Jan Wu
IEEE 12th International Conference on Data Mining, IEEE, Brussels, Belgium (2012), pp. 151-160
This paper describes an efficient and effective design of Robust Spatio-Temporal Prediction based on Student’s t distribution, namely, St-RSTP, to provide estimations based on observations over spatio-temporal neighbors. The proposed St-RSTP is more resilient to outliers or other small departures from model assumptions than its ancestor, the Spatio-Temporal Random Effects (STRE) model. STRE is a state-of-the-art statistical model with linear order complexity for large scale processing. However, it assumes Gaussian observations, which has the well-known limitation of non-robustness. In our St-RSTP design, the measurement error follows Student’s t distribution, instead of a traditional Gaussian distribution. This design reduces the influence of outliers and improves prediction quality, but makes the problem analytically intractable. We therefore propose a novel approximate inference approach, which approximates the model in a form that separates the high dimensional latent variables into groups, and then estimates the posterior distributions of different groups of variables separately in the framework of Expectation Propagation. A useful property is that our approximate approach reduces to the standard STRE based prediction when the degrees of freedom of the Student’s t distribution tend to infinity. Extensive experimental evaluations based on both simulation and real-life data sets demonstrate the robustness and the efficiency of our Student-t prediction model. The proposed approach provides critical functionality for stochastic processes on spatio-temporal data.
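The robustness argument in the abstract above can be illustrated with the two error densities alone. The sketch below is not the St-RSTP model or its Expectation Propagation inference; it only compares the negative log-likelihood that a Gaussian and a Student's t measurement error assign to a single residual. The function names and the unit scale `sigma=1.0` are illustrative assumptions.

```python
import math


def gaussian_nll(residual, sigma=1.0):
    # Negative log-density of a zero-mean Gaussian error.
    return (0.5 * math.log(2 * math.pi * sigma ** 2)
            + residual ** 2 / (2 * sigma ** 2))


def student_t_nll(residual, df, sigma=1.0):
    # Negative log-density of a zero-mean Student's t error
    # with `df` degrees of freedom and scale `sigma`.
    z2 = (residual / sigma) ** 2
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi) - math.log(sigma))
    return -log_norm + (df + 1) / 2 * math.log1p(z2 / df)


if __name__ == "__main__":
    for r in (0.5, 3.0, 10.0):
        print(f"residual={r:5.1f}  "
              f"gaussian={gaussian_nll(r):8.3f}  "
              f"student_t(df=3)={student_t_nll(r, df=3):8.3f}")
```

The Gaussian penalty grows quadratically with the residual, while the Student's t penalty grows only logarithmically, so a single outlier pulls the fit far less. As `df` grows, `student_t_nll` approaches `gaussian_nll`, which mirrors the limiting behavior the abstract describes.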
An Integrated Framework for Spatio-Temporal-Textual Search and Mining
Bingsheng Wang
Haili Dong
Arnold Boedihardjo
Chang-Tien Lu
Harland Yu
Ing-Ray Chen
20th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL GIS 2012), ACM, 2 Penn Plaza, Suite 701, New York, NY 10121, pp. 570-573
This paper presents an integrated framework for a Spatio-Temporal-Textual (STT) information retrieval and knowledge discovery system. The proposed ensemble framework contains an efficient STT search engine with multiple indexing, ranking and scoring schemes, an effective STT pattern miner with Spatio-Temporal (ST) analytics, and novel STT topic modeling. Specifically, we design an effective prediction prototype with a third-order linear regression model, and present an innovative STT topic modeling relevance ranker that scores documents based on their inherent STT features in the topic space. We demonstrate the framework with a crime dataset from the Washington, DC area from 2006 to 2010 and a global terrorism dataset from 2004 to 2010.