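A research group led by Professor Alisandra K. Denton of Heinrich Heine University, Germany, recently reported Helixer, a tool for ab initio prediction of primary eukaryotic gene models that combines deep learning with a hidden Markov model. The paper was published in Nature Methods on November 24, 2025.
In the paper, the group presents Helixer, an artificial intelligence-based tool for ab initio gene prediction that delivers highly accurate gene models across fungal, plant, vertebrate and invertebrate genomes. Unlike traditional methods, Helixer requires no additional experimental data such as RNA sequencing, which makes it broadly applicable to diverse species. The group shows that Helixer's pretrained models achieve accuracy on par with or exceeding current tools, producing gene annotations that closely match expert-curated reference annotations across multiple evaluation metrics. Its design allows immediate use on new genomes without retraining, offering an efficient and accessible solution for genome annotation in both research and applied settings. The tool is open-source software that can be installed locally via GitHub. An online web interface is also available, and the tool can be obtained through the Galaxy ToolShed. By combining deep learning with hidden Markov models, Helixer achieves broad taxonomic coverage for ab initio gene annotation of eukaryotic genomes from fungi, plants, vertebrates and invertebrates.
As background, the accurate identification of genes is vital for understanding biological function, yet it remains challenging in many newly sequenced or less-studied species.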
Appendix: original English text
Title: Helixer: ab initio prediction of primary eukaryotic gene models combining deep learning and a hidden Markov model
Author: Felix Holst, Anthony M. Bolger, Felicitas Kindel, Christopher Günther, Janina Maß, Sebastian Triesch, Niklas Kiel, Nima Saadat, Oliver Ebenhöh, Björn Usadel, Rainer Schwacke, Andreas P. M. Weber, Marie E. Bolger, Alisandra K. Denton
Published: 2025-11-24
Abstract: The accurate identification of genes is vital for understanding biological function, yet this remains challenging across many newly sequenced or less-studied species. Here we present Helixer, an artificial intelligence-based tool for ab initio gene prediction that delivers highly accurate gene models across fungal, plant, vertebrate and invertebrate genomes. Unlike traditional methods, Helixer operates without requiring additional experimental data such as RNA sequencing, making it broadly applicable to diverse species. We show that Helixer’s pretrained models achieve accuracy on par with or exceeding current tools, producing gene annotations that closely match expert-curated references across multiple evaluation metrics. Its design enables immediate use on genomes without retraining, providing an efficient, accessible solution for genome annotation in both research and applied settings. The tool is available as an open-source software for local installation via GitHub. An online web interface is also available as well as through the Galaxy ToolShed. By leveraging both deep learning and hidden Markov models, Helixer achieves broad taxonomic coverage for ab initio gene annotation of eukaryotic genomes from fungi, plants, vertebrates and invertebrates.
DOI: 10.1038/s41592-025-02939-1
Source: https://www.nature.com/articles/s41592-025-02939-1
Nature Methods: founded in 2004 and published by Springer Nature. Latest IF: 47.99
Official website: https://www.nature.com/nmeth/
Submission link: https://mts-nmeth.nature.com/cgi-bin/main.plex
