Scientists develop a multimodal generative AI assistant for human pathology
Author: 小柯机器人 (Xiaoke Robot) Published: 2024/6/16 22:52:54

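Faisal Mahmood's team at Harvard Medical School in the United States has developed a multimodal generative AI assistant for human pathology. The paper was published online in Nature on June 12, 2024.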

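The researchers present PathChat, a vision-language generalist AI assistant for human pathology. They built PathChat by adapting a foundational vision encoder for pathology, combining it with a pretrained large language model, and fine-tuning the whole system on more than 456,000 diverse visual-language instructions comprising 999,202 question-answer turns. They compared PathChat with several multimodal vision-language AI assistants and with GPT4V, which powers the commercially available multimodal general-purpose AI assistant ChatGPT-4. PathChat achieved state-of-the-art performance on multiple-choice diagnostic questions from cases of diverse tissue origins and disease models.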

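In addition, using open-ended questions and human expert evaluation, the researchers found that overall PathChat produced more accurate and more pathologist-preferred responses to diverse pathology-related queries. As an interactive, general vision-language AI assistant that can flexibly handle both visual and natural-language inputs, PathChat could find impactful applications in pathology education, research, and human-in-the-loop clinical decision making.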

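By way of background, the field of computational pathology has made remarkable progress in developing both task-specific predictive models and task-agnostic self-supervised vision encoders. However, despite the explosive growth of generative AI, there has been limited research on building general-purpose, multimodal AI assistants and copilots tailored to pathology.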

Appendix: original English abstract

Title: A Multimodal Generative AI Copilot for Human Pathology

Author: Lu, Ming Y., Chen, Bowen, Williamson, Drew F. K., Chen, Richard J., Zhao, Melissa, Chow, Aaron K., Ikemura, Kenji, Kim, Ahrong, Pouli, Dimitra, Patel, Ankush, Soliman, Amr, Chen, Chengkuan, Ding, Tong, Wang, Judy J., Gerber, Georg, Liang, Ivy, Le, Long Phi, Parwani, Anil V., Weishaupt, Luca L., Mahmood, Faisal

Issue&Volume: 2024-06-12

Abstract: The field of computational pathology[1,2] has witnessed remarkable progress in the development of both task-specific predictive models and task-agnostic self-supervised vision encoders[3,4]. However, despite the explosive growth of generative artificial intelligence (AI), there has been limited study on building general purpose, multimodal AI assistants and copilots[5] tailored to pathology. Here we present PathChat, a vision-language generalist AI assistant for human pathology. We build PathChat by adapting a foundational vision encoder for pathology, combining it with a pretrained large language model and finetuning the whole system on over 456,000 diverse visual language instructions consisting of 999,202 question-answer turns. We compare PathChat against several multimodal vision language AI assistants and GPT4V, which powers the commercially available multimodal general purpose AI assistant ChatGPT-4[7]. PathChat achieved state-of-the-art performance on multiple-choice diagnostic questions from cases of diverse tissue origins and disease models. Furthermore, using open-ended questions and human expert evaluation, we found that overall PathChat produced more accurate and pathologist-preferable responses to diverse queries related to pathology. As an interactive and general vision-language AI Copilot that can flexibly handle both visual and natural language inputs, PathChat can potentially find impactful applications in pathology education, research, and human-in-the-loop clinical decision making.

DOI: 10.1038/s41586-024-07618-3

Source: https://www.nature.com/articles/s41586-024-07618-3

Journal information

Nature: founded in 1869. Part of the Springer Nature publishing group. Latest IF: 69.504
Official website: http://www.nature.com/
Submission link: http://www.nature.com/authors/submit_manuscript.html