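Research from William J. Greenleaf's group at Stanford University uses multiomics and deep learning to dissect regulatory syntax in human development. The study was published in the leading international journal Nature on April 8, 2026.
Here, the team presents the Human Development Multiomic Atlas, a single-cell atlas of chromatin accessibility and gene expression from 817,740 fetal cells across 12 organs, spanning 203 cell types and more than 1 million candidate cis-regulatory elements, many of which show organ-specific in vivo enhancer activity. Deep learning models trained to predict accessibility from local DNA sequence uncovered a comprehensive lexicon of motifs that influence accessibility, including composite motifs with distinct syntactic constraints that are predicted to mediate transcription factor cooperativity.
The researchers identified "hard" syntactic rules that require precise motif spacing and orientation, "soft" rules that allow flexible motif arrangements, and ubiquitous motifs that inhibit accessibility. Model-based interpretation of genetic variants showed that disrupting motifs with positive or negative effects is associated with concordant effects on gene expression. Their work delineates how motif syntax governs cell-type-specific chromatin accessibility and provides a foundational resource for decoding cis-regulatory logic and interpreting genetic variation during human development.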
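To make the distinction between "hard" and "soft" motif syntax concrete, here is a minimal toy sketch in Python. It is not the paper's method, which relies on deep learning models rather than exact string matching. The motif strings, the gap thresholds, and all function names are hypothetical choices made purely for illustration.

```python
# Toy illustration only: motifs, gap values, and function names are
# hypothetical and are not taken from the paper.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str) -> list:
    """Return (start, strand) for every exact motif match on either strand."""
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


def pair_gaps(seq: str, motif_a: str, motif_b: str) -> list:
    """List (gap, strand_a, strand_b) for each pair where A lies upstream of B.

    The gap is the number of bases between the end of A and the start of B.
    """
    pairs = []
    for start_a, strand_a in find_motif(seq, motif_a):
        for start_b, strand_b in find_motif(seq, motif_b):
            gap = start_b - (start_a + len(motif_a))
            if gap >= 0:
                pairs.append((gap, strand_a, strand_b))
    return pairs


def satisfies_hard_rule(seq: str, motif_a: str, motif_b: str, gap: int = 2) -> bool:
    """Hard rule (assumed form): exact gap, both motifs on the forward strand."""
    return any(g == gap and sa == "+" and sb == "+"
               for g, sa, sb in pair_gaps(seq, motif_a, motif_b))


def satisfies_soft_rule(seq: str, motif_a: str, motif_b: str, max_gap: int = 20) -> bool:
    """Soft rule (assumed form): any orientation, any gap up to max_gap."""
    return any(g <= max_gap for g, _, _ in pair_gaps(seq, motif_a, motif_b))
```

With the hypothetical motifs `GATAA` and `CACGT`, a sequence that places them two bases apart on the forward strand passes both checks. A sequence with a seven-base gap and the second motif reverse-complemented passes only the soft check.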
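By way of background, transcription factors establish cell identity during development by binding regulatory DNA in a sequence-specific manner, often promoting local chromatin accessibility and regulating gene expression. Mapping accessible chromatin offers important insight into transcriptional control, but the datasets available for human development are limited to bulk tissue, single organs or single modalities.
Appendix: original English text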
Title: Multiomics and deep learning dissect regulatory syntax in human development
Author: Liu, Betty B., Jessa, Selin, Kim, Samuel H., Ng, Yan Ting, Higashino, Soon Il, Marinov, Georgi K., Chen, Derek C., Parks, Benjamin E., Li, Li, Nguyen, Tri C., Wang, Austin T., Wang, Sean K., Tan, Meng How, Tan, Serena Y., Kosicki, Michael, Pennacchio, Len A., Ben-David, Eyal, Pasca, Anca M., Kundaje, Anshul, Farh, Kyle K. H., Greenleaf, William J.
Issue&Volume: 2026-04-08
Abstract: Transcription factors establish cell identity during development by binding regulatory DNA in a sequence-specific manner, often promoting local chromatin accessibility and regulating gene expression [1]. Mapping accessible chromatin offers critical insights into transcriptional control, but available datasets for human development are restricted to bulk tissue, single organs or single modalities [2]. Here we present the Human Development Multiomic Atlas, a single-cell atlas of chromatin accessibility and gene expression from 817,740 fetal cells across 12 organs, spanning 203 cell types and more than 1 million candidate cis-regulatory elements, many of which exhibit organ-specific in vivo enhancer activity. Deep learning models trained to predict accessibility from local DNA sequence unravel a comprehensive lexicon of motifs that influence accessibility, including composite motifs exhibiting distinct syntactic constraints that are predicted to mediate transcription factor cooperativity. We identify ‘hard’ syntactic rules requiring precise motif spacing and orientation, ‘soft’ rules allowing flexible motif arrangements, and ubiquitous motifs inhibiting accessibility. Model-based interpretation of genetic variants reveals that disruption of motifs with positive and negative effects is associated with concordant effects on gene expression. Our work delineates how motif syntax governs cell-type-specific chromatin accessibility and provides a foundational resource for decoding cis-regulatory logic and interpreting genetic variation during human development.
DOI: 10.1038/s41586-026-10326-9
Source: https://www.nature.com/articles/s41586-026-10326-9
Nature: founded in 1869, part of the Springer Nature publishing group. Latest IF: 69.504
Official website: http://www.nature.com/
Submission link: http://www.nature.com/authors/submit_manuscript.html
