- Félix Merlin Angelo Philippe Raimundo
- Celine Vallot
- Jean-Philippe Vert
Many computational methods have been developed recently to analyze single-cell RNA-seq (scRNA-seq) data. Several benchmark studies have compared these methods on dimensionality reduction, clustering or differential analysis, often relying on default parameters. Yet given the biological diversity of scRNA-seq datasets, parameter tuning might be essential for the optimal use of these methods, and determining how to tune parameters remains an unmet need. Here, we propose a benchmark that assesses the performance of five methods, systematically varying their tunable parameters, for dimension reduction (DR) of scRNA-seq data, a common first step in many downstream applications such as cell type identification or trajectory inference. We run a total of ∼1.5 million experiments to assess the influence of parameter changes on the performance of each method, and propose two strategies to automatically tune parameters for methods that need it. We find that principal component analysis (PCA)-based methods like scran and Seurat are competitive with default parameters but do not benefit much from parameter tuning, while more complex models like ZinbWave, DCA and scVI can reach better performance, but only after parameter tuning.