Zhou, Z., Du, Z., and Chaterji, S. (2022). KRATOS: Context-Aware Cell Type Classification and Interpretation Using Joint Dimensionality Reduction and Clustering. KDD 2022.
Students
Zihan Zhou, Zijia Du
Abstract
A common workflow for single-cell RNA-sequencing (sc-RNA-seq) data analysis is to orchestrate a three-step pipeline. First, conduct a dimension reduction of the input cell profile matrix; second, cluster the cells in the latent
space; and third, extract the “gene panels” that distinguish a certain cluster from others. This workflow has the primary drawback that the three steps are performed independently, neglecting the dependencies among the steps and among
the marker genes or gene panels. In our system, Kratos, we alter the three-step workflow to a two-step one, where we jointly optimize the first two steps and add the third (interpretability) step to form an integrated sc-RNA-seq analysis
pipeline. We show that the more compact workflow of Kratos extracts marker genes that can better discriminate the target cluster, distilling underlying mechanisms guiding cluster membership. In doing so, Kratos is significantly better
than the two SOTA baselines we compare against, specifically 5.62% superior to Global Counterfactual Explanation (GCE) [ICML-20], and 3.31% better than Adversarial Clustering Explanation (ACE) [ICML-21], measured by the AUROC of a kernel-SVM
classifier. We open-source our code and datasets here:
https://github.com/icanforce/single-cell-genomics-kratos.