
OpenAI releases CoT monitoring to stop malicious behavior of large models

2025-03-10 22:50:05
OpenAI has released new research showing that chain-of-thought (CoT) monitoring can detect and help stop malicious behavior in large models, such as producing nonsense or hiding their true intentions. OpenAI also describes it as one of the effective tools for overseeing superhuman models. In the study, the newly released frontier model o3-mini is the monitored model, and the weaker GPT-4o model acts as the monitor. The test environment is a set of coding tasks in which the AI must implement functions in a code base so that the unit tests pass. The results show that the CoT monitor performs well at detecting systematic "reward hacking," with a recall of 95%, far above the 60% achieved by monitoring the model's actions alone.
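For readers unfamiliar with the metric, recall is the share of actual reward-hacking episodes that a monitor flags. The sketch below is only an illustration: the episode labels are hypothetical and were chosen so the arithmetic reproduces the reported 95% and 60% figures. They are not OpenAI's data, and the function and variable names are assumptions made for this example.

```python
def recall(flagged, is_hack):
    """Fraction of true reward-hacking episodes that the monitor flagged."""
    true_positives = sum(1 for f, h in zip(flagged, is_hack) if f and h)
    actual_positives = sum(is_hack)
    if actual_positives == 0:
        raise ValueError("recall is undefined when there are no positive cases")
    return true_positives / actual_positives


# Hypothetical labels: 20 episodes, each containing a reward hack.
is_hack = [True] * 20

# Hypothetical monitor decisions (not real experimental data).
cot_monitor_flags = [True] * 19 + [False]          # flags 19 of 20 hacks
action_monitor_flags = [True] * 12 + [False] * 8   # flags 12 of 20 hacks

cot_recall = recall(cot_monitor_flags, is_hack)
action_recall = recall(action_monitor_flags, is_hack)

print(f"CoT monitor recall: {cot_recall:.0%}")
print(f"Action-only monitor recall: {action_recall:.0%}")
```

Recall only counts missed hacks. It says nothing about how often a monitor raises false alarms on honest episodes, which would be measured separately by precision.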
Disclaimer:
1. The information provided does not constitute investment advice. Investors should make independent decisions and bear all risks themselves.
2. The copyright of this content belongs to the original author. The views expressed herein are solely those of the author and do not represent the stance or position of this website.