Microsoft announced significant updates to its 365 Copilot platform, enhancing the Researcher agent and launching the Copilot Cowork feature.
The changes are meant to make AI-generated research reports more accurate and more useful. Microsoft said it now gives evaluation the same weight as generation, with the goal of improving factual accuracy and presentation.
The Researcher agent now includes a new “Critique” mode that utilizes two separate models: one for generating reports and another for expert review. “By giving evaluation as much emphasis as generation,” the company stated, “this architecture creates a powerful feedback loop that delivers higher-quality results across factual accuracy, analytical breadth, and presentation.” This mode will be enabled by default, although users can opt for a single model from OpenAI or Anthropic for the full process.
Microsoft claims that Researcher with the updated Critique mode outperformed other systems on a benchmark developed by Perplexity that scores research models on accuracy, completeness, and objectivity.
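To make the generate-then-review split concrete, here is a minimal Python sketch of how such a loop could be wired up. It is an illustration only, not Microsoft's implementation: the `Model` type, the function names, the prompts, and the toy models are all hypothetical, and the revision step (feeding the review back to the generator) is an assumption about how the "feedback loop" might work.

```python
from typing import Callable

# Hypothetical stand-in: any function that maps a prompt to text.
Model = Callable[[str], str]


def critique_research(question: str, generator: Model, reviewer: Model,
                      rounds: int = 1) -> str:
    """Draft a report with one model, then revise it using a second model's review."""
    report = generator(f"Write a research report on: {question}")
    for _ in range(rounds):
        # The reviewer only evaluates; it never writes the report itself.
        review = reviewer(
            "Review this report for accuracy, breadth, and presentation:\n" + report
        )
        # Assumption: the generator revises its own draft based on the review.
        report = generator(
            f"Revise the report using this review.\nReport:\n{report}\nReview:\n{review}"
        )
    return report


# Toy models so the sketch runs without any API access.
def toy_generator(prompt: str) -> str:
    return "revised draft" if prompt.startswith("Revise") else "first draft"


def toy_reviewer(prompt: str) -> str:
    return "cite a source for claim 2"


if __name__ == "__main__":
    print(critique_research("battery recycling", toy_generator, toy_reviewer))
```

In a real deployment the two callables would wrap different vendors' models, which is the point of the design: the reviewer does not share the generator's blind spots.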
The company also introduced a “Council” mode, which has an Anthropic model and an OpenAI model each generate a report on the same question, side by side. Once both reports are complete, a dedicated judge model evaluates them, summarizing the key findings and noting where the two agree and where they diverge.
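The Council pattern described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not Microsoft's code: `council_research`, `CouncilResult`, the judge prompt, and the toy models are hypothetical names invented for the example, and the members are called one after another here for simplicity rather than in parallel.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical stand-in: any function that maps a prompt to text.
Model = Callable[[str], str]


@dataclass
class CouncilResult:
    reports: Dict[str, str]  # one independent report per council member
    verdict: str             # the judge's summary of findings and disagreements


def council_research(question: str, members: Dict[str, Model],
                     judge: Model) -> CouncilResult:
    """Have each member answer independently, then ask a judge to compare them."""
    reports = {name: model(question) for name, model in members.items()}
    combined = "\n\n".join(f"[{name}]\n{text}" for name, text in reports.items())
    verdict = judge(
        "Summarize key findings, agreements, and divergences:\n" + combined
    )
    return CouncilResult(reports=reports, verdict=verdict)


if __name__ == "__main__":
    # Toy models so the sketch runs without any API access.
    members = {
        "model_a": lambda q: f"A's report on {q}",
        "model_b": lambda q: f"B's report on {q}",
    }
    judge = lambda prompt: f"compared {prompt.count('[')} reports"
    result = council_research("solar storage", members, judge)
    print(result.verdict)
```

Keeping the individual reports in the result, rather than returning only the verdict, lets a reader check the judge's summary against what each model actually wrote.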
Additionally, the new Copilot Cowork feature, which is based on Anthropic’s Claude Cowork tool, is now in early access through the Frontier program. That program lets users try tools that are still under development.





