Author: Xiaoke Robot (小柯机器人) Published: 2024/3/5 14:03:08

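A study characterizing how dataset imbalance affects single-cell data integration was carried out jointly by Bo Wang and Hassaan Maan of the University Health Network in Canada and the research group of Kieran R. Campbell at the University of Toronto. It was published in the international academic journal Nature Biotechnology on March 1, 2024.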





Title: Characterizing the impacts of dataset imbalance on single-cell data integration

Author: Maan, Hassaan; Zhang, Lin; Yu, Chengxin; Geuenich, Michael J.; Campbell, Kieran R.; Wang, Bo

Issue&Volume: 2024-03-01

Abstract: Computational methods for integrating single-cell transcriptomic data from multiple samples and conditions do not generally account for imbalances in the cell types measured in different datasets. In this study, we examined how differences in the cell types present, the number of cells per cell type and the cell type proportions across samples affect downstream analyses after integration. The Iniquitate pipeline assesses the robustness of integration results after perturbing the degree of imbalance between datasets. Benchmarking of five state-of-the-art single-cell RNA sequencing integration techniques in 2,600 integration experiments indicates that sample imbalance has substantial impacts on downstream analyses and the biological interpretation of integration results. Imbalance perturbation led to statistically significant variation in unsupervised clustering, cell type classification, differential expression and marker gene annotation, query-to-reference mapping and trajectory inference. We quantified the impacts of imbalance through newly introduced properties—aggregate cell type support and minimum cell type center distance. To better characterize and mitigate impacts of imbalance, we introduce balanced clustering metrics and imbalanced integration guidelines for integration method users.

DOI: 10.1038/s41587-023-02097-9

Source: https://www.nature.com/articles/s41587-023-02097-9


Nature Biotechnology: founded in 1996 and published by Springer Nature. Latest IF: 68.164