OpenAI Introduces CriticGPT for Better AI Training

Sousa Brothers
2 min read4 days ago

--

OpenAI has introduced CriticGPT, a new AI training tool designed to support human trainers in refining AI systems. CriticGPT aims to enhance the credibility and intelligence of chatbots by assisting trainers in evaluating intricate outputs like software code.

This tool leverages Reinforcement Learning from Human Feedback (RLHF), a method that integrates human input to adjust AI models, ensuring that AI generates coherent and accurate outputs.

CriticGPT is a specialized version of OpenAI’s GPT-4 model, trained on a dataset containing code with deliberately inserted bugs. This training enables the model to detect various coding errors.

Tests have shown that human judges preferred CriticGPT’s critiques 63 percent of the time. OpenAI plans to expand this technique to other areas beyond code review.

Beyond code reviews, CriticGPT identified errors in 24 percent of ChatGPT training data that human annotators had previously deemed flawless. CriticGPT was trained on a dataset containing code with deliberately inserted bugs, which allowed it to detect various coding errors.

Teams comprising both humans and CriticGPT delivered more comprehensive critiques and reduced inaccuracies compared to AI-only critiques.

OpenAI has also developed a new method known as Force Sampling Beam Search (FSBS) to adjust the critique thoroughness while managing false positives. This technique enhances CriticGPT’s ability to provide detailed code reviews by controlling its thoroughness in error detection and the frequency of false alarms.

While CriticGPT has limitations, having been trained on relatively short ChatGPT responses, it can reduce inaccuracies but does not eliminate them entirely. OpenAI plans to integrate CriticGPT-like tools into its RLHF labeling workflow, offering AI-assisted support to trainers.

This new strategy is part of a broader initiative to perfect large language models and ensure their behavior remains acceptable as their capabilities grow.

Competitors like Anthropic have also announced advancements in their AI models, including an upgraded version of their Claude chatbot thanks to improved training techniques and data inputs. Both companies are investigating new methods to monitor AI models to prevent undesired behaviors like deception.

Microsoft is working on improved AI training approaches, leveraging active preference elicitation to fine-tune the efficiency and precision of LLMs. Nvidia is betting on leveraging synthetic data to improve AI model training, launching Nemotron-4 340B, a series of open models crafted to produce synthetic data for training large language models (LLMs).

https://winbuzzer.com/2024/06/28/openai-introduces-criticgpt-for-better-ai-training-xcxwbn/

--

--