Joonseok Lee

Joonseok Lee is a research engineer in the Foresight group at Google Research. He mainly works on multi-modal video representation learning. He earned his Ph.D. in Computer Science from the Georgia Institute of Technology in August 2015, under the supervision of Dr. Guy Lebanon and Prof. Hongyuan Zha. His thesis is about local approaches for collaborative filtering, with recommendation systems as the main application. He completed three internships during his Ph.D., at Amazon (Summer 2014), Microsoft Research (Spring 2014), and Google (Summer 2013). Before coming to Georgia Tech, he worked at NHN Corp. in Korea (2007-2010). He received his B.S. degree in computer science and engineering from Seoul National University, Korea. His paper "Local Collaborative Ranking" received the best student paper award at the 23rd International World Wide Web Conference (2014). He has served as a program committee member for many conferences including NIPS, ICML, CVPR, ICCV, AAAI, WSDM, and CIKM, and for journals including JMLR, ACM TIST, and IEEE TKDE. He co-organized the YouTube-8M Large-Scale Video Understanding Workshop as a program chair, and served as the publicity chair for the AISTATS 2015 conference. He is currently serving as a reviewer for the Google Faculty Research Awards Program. More information is available on his website (http://www.joonseok.net).
Authored Publications
    MuLan: A Joint Embedding of Music Audio and Natural Language
    Qingqing Huang
    Ravi Ganti
    Judith Yue Li
    Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR) (2022) (to appear)
    Music tagging and content-based retrieval systems have traditionally been constructed using pre-defined ontologies covering a rigid set of music attributes or text queries. This paper presents MuLan: a first attempt at a new generation of acoustic models that link music audio directly to unconstrained natural language music descriptions. MuLan takes the form of a two-tower, joint audio-text embedding model trained using 44 million music recordings (370K hours) and weakly-associated, free-form text annotations. Through its compatibility with a wide range of music genres and text styles (including conventional music tags), the resulting audio-text representation subsumes existing ontologies while graduating to true zero-shot functionalities. We demonstrate the versatility of the MuLan embeddings with a range of experiments including transfer learning, zero-shot music tagging, language understanding in the music domain, and cross-modal retrieval applications.
    A Conservative Approach for Unbiased Learning on Unknown Biases
    Myeongho Jeon
    Daekyung Kim
    Woochul Lee
    Myungjoo Kang
    Proceedings of the 38th IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2022)
    Although convolutional neural networks (CNNs) achieve state-of-the-art in image classification, recent works address their unreliable predictions due to their excessive dependence on biased training data. Existing unbiased modeling postulates that the bias in the dataset is obvious to know, but it is actually unsuited for image datasets including countless sensory attributes. To mitigate this issue, we present a new scenario that does not necessitate a predefined bias. Under the observation that CNNs do have multi-variant and unbiased representations in the model, we propose a conservative framework that employs this internal information for unbiased learning. Specifically, this mechanism is implemented via hierarchical features captured along the multiple layers and orthogonal regularization. Extensive evaluations on public benchmarks demonstrate our method is effective for unbiased learning.
    Towards Detailed Characteristic-Preserving Virtual Try-On
    Sangho Lee
    Seoyoung Lee
    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), The 5th Workshop on Computer Vision for Fashion, Art, and Design (2022)
    While virtual try-on has rapidly progressed recently, existing virtual try-on methods still struggle to faithfully represent various details of the clothes when worn. In this paper, we propose a simple yet effective method to better preserve details of the clothing and person by introducing an additional fitting step after geometric warping. This minimal modification enables disentangling representations of the clothing from the wearer, hence we are able to preserve the wearer-agnostic structure and details of the clothing, to fit a garment naturally to a variety of poses and body shapes. Moreover, we propose a novel evaluation framework applicable to any metric, to better reflect the semantics of clothes fitting. From extensive experiments, we empirically verify that the proposed method not only learns to disentangle clothing from the wearer, but also preserves details of the clothing on the try-on results.
    S-Walk: Accurate and Scalable Session-based Recommendation with Random Walks
    Minjin Choi
    Jinhong Kim
    Hyunjung Shim
    Jongwuk Lee
    Proceedings of the 15th ACM International Conference on Web Search and Data Mining (WSDM), ACM (2022)
    Session-based recommendation (SR) aims at predicting the next items from a sequence of the previous items consumed by an anonymous user. Most existing SR models focus only on modeling intra-session characteristics, but neglect inter-session relationships of items, which help improve accuracy. Another critical aspect of recommender systems is computational efficiency and scalability, considering practical concerns in commercial applications. In this paper, we propose the novel Session-based Recommendation with Random Walk, namely S-Walk. Specifically, S-Walk can effectively capture both intra- and inter-session correlations on items by handling high-order relationships across items using random walks with restart (RWR). At the same time, S-Walk is highly efficient and scalable by adopting linear models with closed-form solutions for transition and teleportation matrices to formulate RWR. Despite its simplicity, our extensive experiments demonstrate that S-Walk achieves comparable or state-of-the-art performance in various metrics on four benchmark datasets. Moreover, the model learned by S-Walk can be highly compressed without sacrificing accuracy, achieving two or more orders of magnitude faster inference than existing DNN-based models, making it particularly suitable for large-scale commercial systems.
    Bilateral Self-unbiased Recommender Learning for Missing-not-at-Random Implicit Feedback
    Jaewoong Lee
    Seongmin Park
    Jongwuk Lee
    Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), ACM (2022) (to appear)
    Unbiased recommender learning aims at eliminating the intrinsic bias from implicit feedback under the missing-not-at-random (MNAR) assumption. Existing studies primarily focus on estimating the propensity score for item popularity bias but neglect to address the exposure bias of items caused by recommender models, i.e., when the recommender model renders an item more frequently, users tend to click the item more. To resolve this issue, we propose a novel unbiased recommender learning framework, namely Bilateral self-unbiased recommender (BISER). Concretely, BISER consists of two parts: (i) estimating self-inverse propensity weighting (SIPW) for the exposure bias during model training and (ii) utilizing bilateral unbiased learning (BU) to minimize the difference in model predictions between user- and item-based models, thereby alleviating the high variance from SIPW. Our extensive experiments show that BISER significantly outperforms state-of-the-art unbiased recommender models on various real-world datasets, such as Coat, Yahoo! R3, MovieLens-100K, and CiteULike.
    Session-aware Linear Item-Item Models for Session-based Recommendation
    Minjin Choi
    Jinhong Kim
    Hyunjung Shim
    Jongwuk Lee
    Proceedings of the ACM Conference on the Web (2021)
    Session-based recommendation aims at predicting the next item given a sequence of previous items consumed in the session (e.g., in e-commerce or multimedia streaming services). Specifically, session data exhibits unique characteristics, i.e., session consistency, sequential dependency, repeated item consumption, and timeliness of sessions. In this paper, we propose simple-yet-effective session-aware linear models, considering the holistic aspects of the sessions. This holistic nature of our model helps improve the quality of recommendations, and more importantly provides a generalized framework for various session data. Thanks to the closed-form solution for the linear models, the proposed models are highly scalable. Experimental results demonstrate that our simple linear models show comparable or state-of-the-art performance in various metrics on multiple real-world datasets.
    Local Collaborative Autoencoders
    Minjin Choi
    Yoongi Jeong
    Jongwuk Lee
    Proceedings of the 14th ACM International Conference on Web Search and Data Mining (WSDM), ACM (2021)
    Top-N recommendation is a challenging problem because complex and sparse user-item interactions should be adequately addressed to achieve high-quality recommendation results. The local latent factor approach has been successfully used with multiple local models to capture diverse user preferences with different subcommunities. However, previous studies have not fully explored the potential of local models, and failed to identify many small and coherent sub-communities. In this paper, we present Local Collaborative Autoencoders (LOCA), a generalized local latent factor framework. Specifically, LOCA adopts different neighborhood ranges at the training and inference stages. Besides, LOCA uses a novel sub-community discovery method, maximizing the coverage of a union of local models and employing a large number of diverse local models. By adopting autoencoders as the base model, LOCA captures latent non-linear patterns representing meaningful user-item interactions within sub-communities. Our experimental results demonstrate that LOCA is scalable and outperforms state-of-the-art models on several public benchmarks, by 2.99-4.70% in Recall and 1.02-7.95% in NDCG, respectively.
    Continuous-Time Video Generation via Learning Motion Dynamics with Neural ODE
    Kangyeol Kim
    Sunghyun Park
    Junsoo Lee
    Sookyung Kim
    Jaegul Choo
    Edward Choi
    Proceedings of the 32nd British Machine Vision Conference (BMVC) (2021)
    In order to perform unconditional video generation, we must learn the distribution of the real-world videos. In an effort to synthesize high-quality videos, various studies attempted to learn a mapping function between noise and videos, including recent efforts to separate motion distribution and appearance distribution. Previous methods, however, learn motion dynamics in discretized, fixed-interval timesteps, which is contrary to the continuous nature of motion of a physical body. In this paper, we propose a novel video generation approach that learns separate distributions for motion and appearance, the former modeled by neural ODE to learn natural motion dynamics. Specifically, we employ a two-stage approach where the first stage converts a noise vector to a sequence of keypoints in arbitrary frame rates, and the second stage synthesizes videos based on the given keypoints sequence and the appearance noise vector. Our model not only quantitatively outperforms recent baselines for video generation in both fixed and varying frame rates, but also demonstrates versatile functionality such as dynamic frame rate manipulation and motion transfer between two datasets, thus opening new doors to diverse video generation applications.
    Vid-ODE: Continuous-Time Video Generation with Neural Ordinary Differential Equation
    Sunghyun Park
    Kangyeol Kim
    Junsoo Lee
    Jaegul Choo
    Sookyung Kim
    Edward Choi
    Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI) (2021)
    Video generation models often operate under the assumption of fixed frame rates, which leads to suboptimal performance when it comes to handling flexible frame rates (e.g., increasing the frame rate of the more dynamic portion of the video as well as handling missing video frames). To resolve the restricted nature of existing video generation models' ability to handle arbitrary timesteps, we propose continuous-time video generation by combining neural ODE (Vid-ODE) with pixel-level video processing techniques. Using ODE-ConvGRU as an encoder, a convolutional version of the recently proposed neural ODE, which enables us to learn continuous-time dynamics, Vid-ODE can learn the spatio-temporal dynamics of input videos of flexible frame rates. The decoder integrates the learned dynamics function to synthesize video frames at any given timesteps, where the pixel-level composition technique is used to maintain the sharpness of individual frames. With extensive experiments on four real-world video datasets, we verify that the proposed Vid-ODE outperforms state-of-the-art approaches under various video generation settings, both within the trained time range (interpolation) and beyond the range (extrapolation). To the best of our knowledge, Vid-ODE is the first work successfully performing continuous-time video generation using real-world videos.
    Saving Face: Investigating the Ethical Concerns of Facial Recognition Auditing
    Inioluwa Deborah Raji
    Timnit Gebru
    Margaret Mitchell
    Joy Buolamwini
    Proceedings of the 3rd AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AIES), ACM (2020)
    Although essential to revealing biased performance, well-intentioned attempts at algorithmic auditing can have effects that may harm the very populations these measures are meant to protect. This concern is even more salient while auditing biometric systems such as facial recognition, where the data is sensitive and the technology is often used in ethically questionable manners. We demonstrate a set of five ethical concerns in the particular case of auditing commercial facial processing technology, highlighting additional design considerations and ethical tensions the auditor needs to be aware of so as not to exacerbate or complement the harms propagated by the audited system. We go further to provide tangible illustrations of these concerns, and conclude by reflecting on what these concerns mean for the role of the algorithmic audit and the fundamental product limitations they reveal.
    Large Scale Video Representation Learning via Relational Graph Clustering
    Hyodong Lee
    Joe Yue-Hei Ng
    Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
    Representation learning is widely applied for various tasks on multimedia data, e.g., retrieval and search. One approach for learning useful representations is to utilize the relationships or similarities between examples. In this work, we explore two promising scalable representation learning approaches in the video domain. With hierarchical graph clusters built upon video-to-video similarities, we propose: 1) a smart negative sampling strategy that significantly boosts training efficiency with triplet loss, and 2) a pseudo-classification approach using the clusters as pseudo-labels. The embeddings trained with the proposed methods are competitive on multiple video understanding tasks, including related video retrieval and video annotation. Both of these proposed methods are highly scalable, as verified by experiments on large-scale datasets.
    N-GCN: Multi-scale Graph Convolution for Semi-supervised Node Classification
    Sami Abu-El-Haija
    Amol Kapoor
    Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI) (2019)
    Graph Convolutional Networks (GCNs) have shown significant improvements in semi-supervised learning on graph-structured data. Concurrently, unsupervised learning of graph embeddings has benefited from the information contained in random walks. In this paper, we propose a model: Network of GCNs (N-GCN), which marries these two lines of work. At its core, N-GCN trains multiple instances of GCNs over node pairs discovered at different distances in random walks, and learns a combination of the instance outputs which optimizes the classification objective. Our experiments show that our proposed N-GCN model improves state-of-the-art baselines on all of the challenging node classification tasks we consider: Cora, Citeseer, Pubmed, and PPI. In addition, our proposed method has other desirable properties, including generalization to recently proposed semi-supervised learning methods such as GraphSAGE, allowing us to propose N-SAGE, and resilience to adversarial input perturbations.
    Learning to Focus and Track Extreme Climate Events
    Sookyung Kim
    Sunghyun Park
    Sunghyo Chung
    Yunsung Lee
    Hyojin Kim
    Mr Prabhat
    Jaegul Choo
    BMVC (2019)
    This paper tackles the task of extreme climate event tracking. It has unique challenges compared to other visual object tracking problems, including a wider range of spatio-temporal dynamics, the unclear boundary of the target, and the shortage of a labeled dataset. We propose a simple but robust end-to-end model based on multi-layered ConvLSTMs, suitable for climate event tracking. It first learns to imprint the location and the appearance of the target at the first frame in an auto-encoding fashion. Next, the learned feature is fed to the tracking module to track the target in subsequent time frames. To tackle the data shortage problem, we propose data augmentation based on conditional generative adversarial networks. Extensive experiments show that the proposed framework significantly improves tracking performance of a hurricane tracking task over several state-of-the-art methods.
    Large-Scale Training Framework for Video Annotation
    Seong Jae Hwang
    Balakrishnan Varadarajan
    Ariel Gordon
    Proc. of the 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), ACM (2019)
    Video is one of the richest sources of information available online but extracting deep insights from video content at internet scale is still an open problem, both in terms of depth and breadth of understanding, as well as scale. Over the last few years, the field of video understanding has made great strides due to the availability of large-scale video datasets and core advances in image, audio, and video modeling architectures. However, the state-of-the-art architectures on small scale datasets are frequently impractical to deploy at internet scale, both in terms of the ability to train such deep networks on hundreds of millions of videos, and to deploy them for inference on billions of videos. In this paper, we present a MapReduce-based training framework, which exploits both data parallelism and model parallelism to scale training of complex video models. The proposed framework uses alternating optimization and full-batch fine-tuning, and supports large Mixture-of-Experts classifiers with hundreds of thousands of mixtures, which enables a trade-off between model depth and breadth, and the ability to shift model capacity between shared (generalization) layers and per-class (specialization) layers. We demonstrate that the proposed framework is able to reach state-of-the-art performance on the largest public video datasets, YouTube-8M and Sports-1M, and can scale to 100 times larger datasets.
    Collaborative Deep Metric Learning for Video Understanding
    Balakrishnan Varadarajan
    Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, ACM (2018)
    The goal of video understanding is to develop algorithms that enable machines to understand videos at the level of human experts. Researchers have tackled various domains including video classification, search, personalized recommendation, and more. However, there is a research gap in combining these domains in one unified learning framework. Towards that, we propose a deep network that embeds videos using their audio-visual content onto a metric space which preserves video-to-video relationships. Then, we use the trained embedding network to tackle various domains including video classification and recommendation, showing significant improvements over state-of-the-art baselines. The proposed approach is highly scalable to deploy on large-scale video sharing platforms like YouTube.
    Local Topic Discovery via Boosted Ensemble of Nonnegative Matrix Factorization
    Sangho Suh
    Jaegul Choo
    Chandan K. Reddy
    Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Sister conferences track (2017)
    Nonnegative matrix factorization (NMF) has been increasingly popular for topic modeling of large-scale documents. However, the resulting topics often represent only general, thus redundant information about the data rather than minor, but potentially meaningful information to users. To tackle this problem, we propose a novel ensemble model of nonnegative matrix factorization for discovering high-quality local topics. Our method leverages the idea of an ensemble model to successively perform NMF given a residual matrix obtained from previous stages and generates a sequence of topic sets. The novelty of our method lies in the fact that it utilizes the residual matrix inspired by a state-of-the-art gradient boosting model and applies a sophisticated local weighting scheme on the given matrix to enhance the locality of topics, which in turn delivers high-quality, focused topics of interest to users.
    Large-Scale Content-Only Video Recommendation
    International Conference on Computer Vision Workshop, Computer Vision Foundation (2017), pp. 987 - 995
    Traditional recommendation systems using collaborative filtering (CF) approaches work relatively well when the candidate videos are sufficiently popular. With the increase of user-created videos, however, recommending fresh videos becomes more and more important, but pure CF-based systems may not perform well in such cold-start situations. In this paper, we model recommendation as a video content-based similarity learning problem, and learn deep video embeddings trained to predict video relationships identified by a co-watch-based system but using only visual and audio content. The system does not depend on the availability of video metadata, and can generalize to both popular and tail content, including new video uploads. We demonstrate the performance of the proposed method on large-scale datasets, both quantitatively and qualitatively.
    Many recent advancements in Computer Vision are attributed to large datasets. Open-source software packages for Machine Learning and inexpensive commodity hardware have reduced the barrier of entry for exploring novel approaches at scale. It is possible to train models over millions of examples within a few days. Although large-scale datasets exist for image understanding, such as ImageNet, there are no comparably sized video classification datasets. In this paper, we introduce YouTube-8M, the largest multi-label video classification dataset, composed of ~8 million videos (500K hours of video) annotated with a vocabulary of 4803 visual entities. To get the videos and their (multiple) labels, we used the YouTube Data APIs. We filtered the video labels (Freebase topics) using both automated and manual curation strategies, including by asking Mechanical Turk workers if the labels are visually recognizable. Then, we decoded each video at one-frame-per-second, and used a Deep CNN pre-trained on ImageNet to extract the hidden representation immediately prior to the classification layer. Finally, we compressed the frame features and made both the features and video-level labels available for download. The dataset contains frame-level features for over 1.9 billion video frames and 8 million videos, making it the largest public multi-label video dataset. We trained various (modest) classification models on the dataset, evaluated them using popular evaluation metrics, and report them as baselines. Despite the size of the dataset, some of our models train to convergence in less than a day on a single machine using the publicly-available TensorFlow framework. We plan to release code for training a basic TensorFlow model and for computing metrics. We show that pre-training on large data generalizes to other datasets like Sports-1M and ActivityNet. We achieve state-of-the-art on ActivityNet, improving mAP from 53.8% to 77.8%. We hope that the unprecedented scale and diversity of YouTube-8M will lead to advances in video understanding and representation learning.
    L-EnsNMF: Boosted Local Topic Discovery via Ensemble of Nonnegative Matrix Factorization
    Sangho Suh
    Jaegul Choo
    Chandan K. Reddy
    Proceedings of the IEEE International Conference on Data Mining (ICDM) (2016)
    Nonnegative matrix factorization (NMF) has been widely applied in many domains. In document analysis, it has been increasingly used in topic modeling applications, where a set of underlying topics are revealed by a low-rank factor matrix from NMF. However, it is often the case that the resulting topics give only general topic information in the data, which tends not to convey much information. To tackle this problem, we propose a novel ensemble model of nonnegative matrix factorization for discovering high-quality local topics. Our method leverages the idea of an ensemble model, which has been successful in supervised learning, into an unsupervised topic modeling context. That is, our model successively performs NMF given a residual matrix obtained from previous stages and generates a sequence of topic sets. Our algorithm for updating the input matrix has novelty in two aspects. The first lies in utilizing the residual matrix inspired by a state-of-the-art gradient boosting model, and the second stems from applying a sophisticated local weighting scheme on the given matrix to enhance the locality of topics, which in turn delivers high-quality, focused topics of interest to users. We evaluate our proposed method by comparing it against other topic modeling methods, such as a few variants of NMF and latent Dirichlet allocation, in terms of various evaluation measures representing topic coherence, diversity, coverage, computing time, and so on. We also present qualitative evaluation on the topics discovered by our method using several real-world data sets.
    LLORMA: Local Low-Rank Matrix Approximation
    Guy Lebanon
    Yoram Singer
    Samy Bengio
    Journal of Machine Learning Research (JMLR), vol. 17 (2016), pp. 1-24
    Matrix approximation is a common tool in recommendation systems, text mining, and computer vision. A prevalent assumption in constructing matrix approximations is that the partially observed matrix is low-rank. In this paper, we propose, analyze, and experiment with two procedures, one parallel and the other global, for constructing local matrix approximations. The two approaches approximate the observed matrix as a weighted sum of low-rank matrices. These matrices are limited to a local region of the observed matrix. We analyze the accuracy of the proposed local low-rank modeling. Our experiments show improvements in prediction accuracy over classical approaches for recommendation tasks.
    Content-based Related Video Recommendations
    Nisarg Kothari
    Advances in Neural Information Processing Systems (NIPS) Demonstration Track (2016)
    This is a demo of related video recommendations, seeded from random YouTube videos, and based purely on video content signals. Traditional recommendation systems using collaborative filtering (CF) approaches suggest related videos for a given seed based on how many users have watched a particular candidate video right after watching the seed video. This does not take the video content into account but relies on aggregate user behavior. Traditional CF approaches work very well when the seed and the candidate videos are relatively popular: they must be watched in a sequence by many users in order for them to be identified as related by the CF system. In this demo, we focus on the cold-start problem, where the seed video, the candidate video, or both are freshly uploaded (or undiscovered), so the CF system cannot identify any related videos for them. Being able to recommend freshly uploaded videos, as well as to recommend good related videos for fresh video seeds, is important for improving freshness and user engagement. We model this as a video content-based similarity learning problem, and learn deep video embeddings trained to predict ground-truth video relationships (identified by a CF co-watch-based system) but using only visual content. The system does not depend on the availability of video metadata or any click information, and can generalize to both popular and tail content, as well as new video uploads. It embeds any new video into a 1024-dimensional representation based on its content, and pairwise video similarity is computed simply as a dot product in the embedding space. We show that the learned video embeddings generalize beyond simple visual similarity and are able to capture complex semantic relationships.
    Local Collaborative Ranking
    Samy Bengio
    Guy Lebanon
    Yoram Singer
    Proceedings of the 23rd International World Wide Web Conference (WWW), ACM (2014)
    Personalized recommendation systems are used in a wide variety of applications such as electronic commerce, social networks, web search, and more. Collaborative filtering approaches to recommendation systems typically assume that the rating matrix (e.g., movie ratings by viewers) is low-rank. In this paper, we examine an alternative approach in which the rating matrix is locally low-rank. Concretely, we assume that the rating matrix is low-rank within certain neighborhoods of the metric space defined by (user, item) pairs. We combine a recent approach for local low-rank approximation based on the Frobenius norm with a general empirical risk minimization for ranking losses. Our experiments indicate that the combination of a mixture of local low-rank matrices, each of which was trained to minimize a ranking loss, outperforms many of the currently used state-of-the-art recommendation systems. Moreover, our method is easy to parallelize, making it a viable approach for large scale real-world rank-based recommendation systems.
    Matrix Approximation under Local Low-Rank Assumption
    Guy Lebanon
    Yoram Singer
    The Learning Workshop in International Conference on Learning Representations (ICLR) (2013)
    Matrix approximation is a common tool in machine learning for building accurate prediction models for recommendation systems, text mining, and computer vision. A prevalent assumption in constructing matrix approximations is that the partially observed matrix is of low-rank. We propose a new matrix approximation model where we assume instead that the matrix is only locally of low-rank, leading to a representation of the observed matrix as a weighted sum of low-rank matrices. We analyze the accuracy of the proposed local low-rank modeling. Our experiments show improvements in prediction accuracy in recommendation tasks.
    Local Low-Rank Matrix Approximation
    Guy Lebanon
    Yoram Singer
    Proceedings of the 30th International Conference on Machine Learning (ICML), Journal of Machine Learning Research (2013)
    Matrix approximation is a common tool in recommendation systems, text mining, and computer vision. A prevalent assumption in constructing matrix approximations is that the partially observed matrix is of low-rank. We propose a new matrix approximation model where we assume instead that the matrix is locally of low-rank, leading to a representation of the observed matrix as a weighted sum of low-rank matrices. We analyze the accuracy of the proposed local low-rank modeling. Our experiments show improvements in prediction accuracy over classical approaches for recommendation tasks.
    Estimating Temporal Dynamics of Human Emotions
    Guy Lebanon
    Haesun Park
    Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI) (2015)
    Leveraging Knowledge Bases for Contextual Entity Exploration
    Ariel Fuxman
    Bo Zhao
    Yuanhua Lv
    Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), ACM (2015)
    Local Context Sparse Coding
    Guy Lebanon
    Haesun Park
    Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI) (2015)
    Local Approaches for Collaborative Filtering
    Ph.D. Thesis (2015), pp. 1-158
    A Rapid Screening and Testing Protocol for Keyboard Layout Speed Comparison
    Hanggjun Cho
    R. I. (Bob) McKay
    IEEE Transactions on Human-Machine Systems, vol. 45 (2014), pp. 2168-2291
    Learning Multiple-Question Decision Trees for Cold-Start Recommendation
    Mingxuan Sun
    Fuxin Li
    Ke Zhou
    Guy Lebanon
    Hongyuan Zha
    Proceedings of the 6th ACM International Conference on Web Search and Data Mining (WSDM), ACM (2013)
    Automatic Feature Induction for Stagewise Collaborative Filtering
    Mingxuan Sun
    Guy Lebanon
    Advances in Neural Information Processing Systems (NIPS) (2012)
    Rapid Screening of Keyboard Layouts
    Hanggjun Cho
    R. I. (Bob) McKay
    Proceedings of IEEE International Conference on Systems, Man, and Cybernetics (SMC), IEEE (2012)
    A Comparative Study of Collaborative Filtering Algorithms
    Mingxuan Sun
    Guy Lebanon
    arXiv:1205.3193 (2012)
    PREA: Personalized Recommendation Algorithms Toolkit
    Mingxuan Sun
    Guy Lebanon
    Journal of Machine Learning Research (JMLR), vol. 13 (2012), pp. 2699-2703
    Optimizing a Personalized Cellphone Keypad
    R. I. (Bob) McKay
    Proceedings of the 5th International Conference on Convergence and Hybrid Information Technology (ICHIT) (2011), pp. 237-244