
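A research team led by Kishwar Shafin at the University of Missouri-Kansas City School of Medicine (USA) has reported accurate somatic small variant discovery across multiple sequencing technologies using DeepSomatic. The paper was published in Nature Biotechnology on October 16, 2025.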
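The team presents DeepSomatic, a deep-learning method for detecting somatic small nucleotide variations and insertions and deletions from both short-read and long-read data. The method has modes for whole-genome and whole-exome sequencing, and can run on tumor-normal, tumor-only, and formalin-fixed paraffin-embedded samples. To train DeepSomatic and help address the shortage of publicly available training and benchmarking data for somatic variant detection, the team generated and openly released the Cancer Standards Long-read Evaluation (CASTLE) dataset: six matched tumor-normal cell line pairs whole-genome sequenced with Illumina, PacBio HiFi, and Oxford Nanopore Technologies, along with benchmark variant sets. Across both cell line and patient-derived samples, and across both short-read and long-read sequencing technologies, DeepSomatic consistently outperforms existing variant callers.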
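By way of background, somatic variant detection is an integral part of cancer genomics analysis. While most methods have focused on short-read sequencing, long-read technologies offer potential advantages in repeat mapping and variant phasing.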
Appendix: original English text
Title: Accurate somatic small variant discovery for multiple sequencing technologies with DeepSomatic
Author: Jimin Park, Daniel E. Cook, Pi-Chuan Chang, Alexey Kolesnikov, Lucas Brambrink, Juan Carlos Mier, Joshua Gardner, Brandy McNulty, Samuel Sacco, Ayse G. Keskus, Asher Bryant, Tanveer Ahmad, Jyoti Shetty, Yongmei Zhao, Bao Tran, Giuseppe Narzisi, Adrienne Helland, Byunggil Yoo, Irina Pushel, Lisa A. Lansdon, Chengpeng Bi, Adam Walter, Margaret Gibson, Tomi Pastinen, Rebecca Reiman, Sharvari Mankame, T. Rhyker Ranallo-Benavidez, Christine Brown, Nicolas Robine, Floris P. Barthel, Midhat S. Farooqi, Karen H. Miga, Andrew Carroll, Mikhail Kolmogorov, Benedict Paten, Kishwar Shafin
Issue&Volume: 2025-10-16
Abstract: Somatic variant detection is an integral part of cancer genomics analysis. While most methods have focused on short-read sequencing, long-read technologies offer potential advantages in repeat mapping and variant phasing. We present DeepSomatic, a deep-learning method for detecting somatic small nucleotide variations and insertions and deletions from both short-read and long-read data. The method has modes for whole-genome and whole-exome sequencing and can run on tumor–normal, tumor-only and formalin-fixed paraffin-embedded samples. To train DeepSomatic and help address the dearth of publicly available training and benchmarking data for somatic variant detection, we generated and make openly available the Cancer Standards Long-read Evaluation (CASTLE) dataset of six matched tumor–normal cell line pairs whole-genome sequenced with Illumina, PacBio HiFi and Oxford Nanopore Technologies, along with benchmark variant sets. Across samples, both cell line and patient-derived, and across short-read and long-read sequencing technologies, DeepSomatic consistently outperforms existing callers.
DOI: 10.1038/s41587-025-02839-x
Source: https://www.nature.com/articles/s41587-025-02839-x
	Nature Biotechnology: founded in 1996 and published by Springer Nature. Latest impact factor: 68.164
	Official website: https://www.nature.com/nbt/
	Submission link: https://mts-nbt.nature.com/cgi-bin/main.plex
