Hang Qi
Research Areas
Authored Publications
Sort By
Federated Learning for Affective Computing Tasks
Brian Eoff
Alan Cowen
Josh Belanich
IEEE (2022), pp. 1-8
Preview abstract
Federated learning mitigates the need to store user data in a central datastore for machine learning tasks, and is particularly beneficial when working with sensitive user data or tasks. Although successfully used for applications such as improving keyboard query suggestions, it is not studied systematically for modeling affective computing tasks which are often laden with
subjective labels and high variability across individuals/raters or even by the same participant. In this paper, we study the federated averaging algorithm FedAvg to model self-reported
emotional experience and perception labels on a variety of speech, video and text datasets. We identify two learning paradigms that commonly arise in affective computing tasks: modeling of selfreports (user-as-client), and modeling perceptual judgments such as labeling sentiment of online comments (rater-as-client). In the user-as-client setting, we show that FedAvg generally performs on-par with a non-federated model in classifying self-reports. In the rater-as-client setting, FedAvg consistently performed poorer than its non-federated counterpart. We found that the performance of FedAvg degraded for classes where the interrater agreement was moderate to low. To address this finding, we propose an algorithm FedRater that learns client-specific label distributions in federated settings. Our experimental results show that FedRater not only improves the overall classification performance compared to FedAvg but also provides insights for estimating proxies of inter-rater agreement in distributed settings.
View details
A Field Guide to Federated Optimization
Jianyu Wang
Gauri Joshi
Maruan Al-Shedivat
Galen Andrew
A. Salman Avestimehr
Katharine Daly
Deepesh Data
Suhas Diggavi
Hubert Eichner
Advait Gadhikar
Antonious M. Girgis
Filip Hanzely
Chaoyang He
Samuel Horvath
Martin Jaggi
Tara Javidi
Satyen Chandrakant Kale
Sai Praneeth Karimireddy
Jakub Konečný
Sanmi Koyejo
Tian Li
Peter Richtarik
Karan Singhal
Virginia Smith
Mahdi Soltanolkotabi
Weikang Song
Sebastian Stich
Ameet Talwalkar
Hongyi Wang
Blake Woodworth
Honglin Yuan
Mi Zhang
Tong Zhang
Chunxiang (Jake) Zheng
Chen Zhu
arxiv (2021)
Preview abstract
Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection. The distributed learning process can be formulated as solving federated optimization problems, which emphasize communication efficiency, data heterogeneity, compatibility with privacy and system requirements, and other constraints that are not primary considerations in other problem settings. This paper provides recommendations and guidelines on formulating, designing, evaluating and analyzing federated optimization algorithms through concrete examples and practical implementation, with a focus on conducting effective simulations to infer real-world performance. The goal of this work is not to survey the current literature, but to inspire researchers and practitioners to design federated learning algorithms that can be used in various practical applications.
View details
Preview abstract
Federated Learning enables visual models to be trained on-device, bringing advantages for user privacy (data need never leave the device), but challenges in terms of data diversity and quality. Whilst typical models in the datacenter are trained using data that are independent and identically distributed (IID), data at source are typically far from IID. In this work, we characterize the effect this non-identical distribution has on distributed learning, using as a benchmark the standard Federated Averaging (FedAvg) algorithm. To do so, we introduce two new large-scale datasets for species and landmark classification, with realistic per-user data splits that simulate real-world edge learning scenarios. We also develop two new algorithms (FedVC, FedIR) that intelligently resample and reweight over the client pool, bringing large improvements in accuracy and stability in training.
View details
Advances and Open Problems in Federated Learning
Brendan Avent
Aurélien Bellet
Mehdi Bennis
Arjun Nitin Bhagoji
Graham Cormode
Rachel Cummings
Rafael G.L. D'Oliveira
Salim El Rouayheb
David Evans
Josh Gardner
Adrià Gascón
Phillip B. Gibbons
Marco Gruteser
Zaid Harchaoui
Chaoyang He
Lie He
Zhouyuan Huo
Justin Hsu
Martin Jaggi
Tara Javidi
Gauri Joshi
Mikhail Khodak
Jakub Konečný
Aleksandra Korolova
Farinaz Koushanfar
Sanmi Koyejo
Tancrède Lepoint
Yang Liu
Prateek Mittal
Richard Nock
Ayfer Özgür
Rasmus Pagh
Ramesh Raskar
Dawn Song
Weikang Song
Sebastian U. Stich
Ziteng Sun
Florian Tramèr
Praneeth Vepakomma
Jianyu Wang
Li Xiong
Qiang Yang
Felix X. Yu
Han Yu
Arxiv (2019)
Preview abstract
Federated learning (FL) is a machine learning setting where many clients (e.g., mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g., service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and mitigates many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches. Motivated by the explosive growth in FL research, this paper discusses recent advances and presents a comprehensive list of open problems and challenges.
View details
Preview abstract
Federated Learning brings the possibility to train visual models in a privacy-preserving way using real-world data on mobile devices. Given their distributed nature, the statistics of the data across these devices is likely to differ significantly. In this work, we look at the effect such non-identical data distributions has on visual classification via Federated Learning. We propose a way to synthesize datasets with a continuous range of identicalness and provide performance measures for the Federated Averaging algorithm. We also provide an improvement to the algorithm when its performance falls off. Experiments on the CIFAR-10 dataset show that such modifications lead to better learning on all setups. In highly skewed settings, we are able to improve performance up to 166%, achieving comparable results to traditional data-center learning in all but the most extreme cases.
View details
Preview abstract
Human vision is able to immediately recognize novel visual categories after seeing just one or a few training examples. We describe how to add a similar capability to ConvNet classifiers by directly setting the final layer weights from novel training examples during low-shot learning. We call this process \emph{weight imprinting} as it directly sets penultimate layer weights based on an appropriately scaled copy of their activations for that training example. The imprinting process provides a valuable complement to training with stochastic gradient descent, as it provides immediate good classification performance and an initialization for any further fine tuning in the future. We show how this imprinting process is related to proxy-based embeddings. However, it differs in that only a single imprinted weight vector is learned for each novel category, rather than relying on a nearest-neighbor distance to training instances as typically used with embedding methods. Our experiments show that using averaging of imprinted weights provides better generalization than using nearest-neighbor instance embeddings. A key change to traditional ConvNet classifiers is the introduction of a scaled normalization layer that allows activations to be directly imprinted as weights.
View details