
AI Trends | OpenAI Releases LifeSciBench Evaluation Benchmark

2026-06-19 15:33:39
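OpenAI has released a new evaluation benchmark, LifeSciBench, designed to measure the capabilities of AI systems in real-world scientific research settings.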

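According to Odaily Planet Daily, LifeSciBench is built on 750 expert-written tasks covering 7 types of research workflows and 7 areas of biology. The tasks were contributed by 173 researchers who hold doctoral-level credentials and have experience in the biotech or pharmaceutical industry.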

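The benchmark emphasizes the assessment of complex research skills, including evidence synthesis, experimental design, data analysis, scientific reasoning, and scientific communication. More than 79% of the tasks involve multi-step reasoning, each task requires about 4 reasoning steps on average, and the benchmark includes 1,062 data attachments drawn from real research.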