
Yichen Zhou
Statistician working on foundation models for structured data.
Authored Publications
Abstract
We pioneer the study of in-context training for time-series foundation models. We create finetuning examples that include not only the usual (context, horizon) pairs for forecasting, but also related time-series examples in-context. We finetune a pretrained time-series foundation model on these in-context examples. Our training is decoder-only and can adapt not only to any (context, horizon) pair (up to a certain maximum context) but also to any number of supplementary time-series examples (again up to a certain maximum number of examples). Appropriately trained models can then learn to borrow patterns from these related examples to improve on the original forecasting task. We show that this opens up interesting capabilities, such as prompting the time-series foundation model with different related examples, which helps the finetuned model adapt to specific features of a dataset at inference time. We show that such adaptations can lead to better zero-shot performance on popular forecasting benchmarks compared to supervised deep learning methods, statistical models, and other time-series foundation models.
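As a rough illustration of the setup described in this abstract, here is a minimal sketch of how such in-context finetuning examples might be assembled for a decoder-only model. The function name, separator token, and length limits are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

SEP = np.array([np.nan])  # hypothetical separator marking boundaries between in-context examples

def build_in_context_example(related_series, target_context, target_horizon,
                             max_examples=4, max_context=512):
    """Assemble one decoder-only finetuning example:
    [ex_1 | SEP | ex_2 | SEP | ... | target_context] -> target_horizon.

    `related_series`: list of 1-D arrays holding related time series.
    The model is trained to predict `target_horizon` given everything before it.
    """
    parts = []
    for series in related_series[:max_examples]:
        parts.append(series[-max_context:])   # truncate each example to the maximum context
        parts.append(SEP)                     # mark the boundary between examples
    parts.append(target_context[-max_context:])
    inputs = np.concatenate(parts)
    targets = np.asarray(target_horizon)
    return inputs, targets

# Usage: prompt the model with two related series plus the target's own history.
related = [np.sin(np.linspace(0, 20, 300)), np.cos(np.linspace(0, 20, 300))]
inputs, targets = build_in_context_example(
    related,
    np.sin(np.linspace(20, 24, 64)),
    np.sin(np.linspace(24, 26, 32)),
)
```

Because the sequence is processed left to right by a decoder-only model, the same trained model can handle any number of in-context examples up to the maximum simply by changing what is placed before the target context.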
Abstract
Motivated by recent advances in large language models for NLP, we design a time-series foundation model for forecasting whose out-of-the-box zero-shot performance on a variety of datasets matches the accuracy of state-of-the-art supervised forecasting models for each individual dataset. Our model is based on pretraining a patched-decoder-style attention model on a large time-series dataset, and it works well across different forecasting history lengths, prediction lengths, and temporal granularities.
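To make the patched-decoder idea concrete, here is a minimal sketch of the patching step, assuming non-overlapping patches of a fixed illustrative length; the function name and patch length are assumptions, not the paper's exact configuration.

```python
import numpy as np

def patch_series(history, patch_len=32):
    """Split a 1-D history into non-overlapping patches (padding the front with zeros
    so the length is a multiple of patch_len); each patch becomes one decoder token."""
    pad = (-len(history)) % patch_len
    padded = np.concatenate([np.zeros(pad), history])
    return padded.reshape(-1, patch_len)          # (num_patches, patch_len)

# Each patch is then linearly projected to the model dimension and processed by a
# causal (decoder-only) transformer; at every position the model emits an output
# patch covering the next several steps, so one trained model can serve different
# history lengths, prediction lengths, and granularities.
history = np.random.randn(300)
tokens = patch_series(history)   # 300 steps -> 10 patches of length 32 (with front padding)
```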
Abstract
In machine learning applications such as ranking fairness or fairness over intersectional groups, one often encounters optimization problems with an extremely large number of constraints. In particular, with ranking fairness tasks, there may even be a variable number of constraints, e.g. one for each query in the training set. In these cases, the standard approach of optimizing a Lagrangian while maintaining one Lagrange multiplier per constraint may no longer be practical. Our proposal is to associate a feature vector with each constraint, and to learn a "multiplier model" that maps each such vector to the corresponding Lagrange multiplier. We prove optimality and feasibility guarantees under assumptions on the flexibility of the multiplier model, and empirically demonstrate that our method is effective on real-world case studies.
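As a rough illustration of the "multiplier model" idea in this abstract, here is a minimal sketch assuming a linear model with a softplus output to keep the predicted multipliers nonnegative; the class name, feature construction, and update details are illustrative assumptions rather than the paper's algorithm.

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))   # keeps predicted multipliers nonnegative

class MultiplierModel:
    """Maps each constraint's feature vector to its Lagrange multiplier, so the
    number of learned parameters is independent of the number of constraints."""
    def __init__(self, feature_dim, lr=0.01):
        self.w = np.zeros(feature_dim)
        self.lr = lr

    def __call__(self, constraint_features):           # (num_constraints, feature_dim)
        return softplus(constraint_features @ self.w)  # one multiplier per constraint

    def ascent_step(self, constraint_features, violations):
        # Gradient ascent on the Lagrangian w.r.t. the multiplier-model parameters:
        # d/dw sum_i softplus(x_i . w) * violation_i, using d softplus = sigmoid.
        z = constraint_features @ self.w
        grad = constraint_features.T @ (violations / (1.0 + np.exp(-z)))
        self.w += self.lr * grad

# Usage within a primal-dual loop (primal step not shown): after measuring the
# per-constraint violations on a batch, take one ascent step on the multipliers.
features = np.random.randn(1000, 8)        # one feature vector per constraint
violations = np.random.randn(1000) * 0.1   # positive = constraint violated
mm = MultiplierModel(feature_dim=8)
lambdas = mm(features)                      # multipliers used in the primal Lagrangian
mm.ascent_step(features, violations)
```

The key design choice this sketch tries to convey is that constraints are never enumerated with individual multipliers: a new constraint (e.g., a new query at training time) simply contributes a new feature vector to the same shared model.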