The Lowpass Dispatch · Tuesday, June 2, 2026

AGENTS

Multi-Agent Systems Show Diminishing Returns

New research finds that simply adding more agents to a system doesn't guarantee better performance, with collaboration overhead and flawed interaction patterns quickly becoming bottlenecks.

Increasing the number of agents in a multi-agent system does not monotonically improve performance, according to a systematic study from June 2024. Researchers found that performance follows a pattern of diminishing returns, as the benefits of collaboration are eventually outweighed by coordination overhead. The optimal number of agents depends heavily on the task type and the capability of the base model.

Further experiments on software design tasks reinforce that the structure of collaboration is more critical than the number of collaborators. A controlled study of 12 different multi-agent topologies found that adversarial and cross-model review structures consistently produced the best results. In a cross-model review, one model generates a design while a different model critiques it, a setup that ranked second-best across three different automated evaluators.
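A minimal sketch of the cross-model review pattern described above, assuming each model is just a function from prompt text to response text. The names `cross_model_review`, `stub_generator`, and `stub_critic` are hypothetical, and the revision step after the critique is an assumption for illustration, not a detail reported by the study.

```python
from typing import Callable

# Hypothetical interface: a "model" takes a prompt and returns text.
Model = Callable[[str], str]


def cross_model_review(task: str, generator: Model, critic: Model, rounds: int = 1) -> str:
    """One model drafts a design; a different model critiques it.

    The generator then revises using the critique (assumed step).
    """
    design = generator(f"Design a solution for: {task}")
    for _ in range(rounds):
        critique = critic(f"Critique this design:\n{design}")
        design = generator(
            f"Task: {task}\nPrevious design:\n{design}\n"
            f"Critique:\n{critique}\nRevise the design."
        )
    return design


# Stand-in models so the sketch runs without any external service.
def stub_generator(prompt: str) -> str:
    return "v2 design" if "Critique:" in prompt else "v1 design"


def stub_critic(prompt: str) -> str:
    return "missing error handling"


if __name__ == "__main__":
    print(cross_model_review("rate limiter", stub_generator, stub_critic))
```

In practice the two callables would wrap different underlying models, which is what distinguishes this setup from a single model reviewing its own output.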

Conversely, the study found that parallel merge architectures, where multiple agents work independently and then merge their results, are "fundamentally broken." These approaches consistently ranked in the bottom tier, suffering from what researchers called the "Frankenstein effect," where poorly integrated components lead to incoherent final designs. The findings suggest that effective multi-agent systems require carefully designed interaction protocols, not just more agents.

Sources: Scaling Behavior of Single LLM-Driven Multi-Agent Systems · LLM Consortium for Software Design Refinement: A Controlled Experiment on Multi-Agent Collaboration Topologies

SECURITY

Malicious Skills Proliferate in Open Agent Ecosystems

Security researchers have uncovered hundreds of malicious skills in public agent registries, revealing a new supply-chain attack vector for AI systems that traditional scanners often miss.

A new security threat is emerging in AI agent ecosystems that allow community-contributed extensions, or "skills." A systematic analysis of 98,380 skills from two major registries identified 157 confirmed malicious skills containing 632 distinct vulnerabilities. The attacks are deliberate, with each malicious skill averaging 4.03 vulnerabilities, and often involve credential theft via remote code execution or agent manipulation through adversarial instructions hidden in documentation.
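The reported figures can be cross-checked with a few lines of arithmetic. The snippet below uses only the numbers quoted above; the derived prevalence percentage is a calculation from those numbers, not a figure stated by the researchers.

```python
total_skills = 98_380      # skills analyzed across the two registries
malicious_skills = 157     # confirmed malicious skills
vulnerabilities = 632      # distinct vulnerabilities found in them

# Average vulnerabilities per malicious skill, rounded to two decimals.
avg_per_skill = round(vulnerabilities / malicious_skills, 2)

# Share of all analyzed skills that were confirmed malicious, in percent.
malicious_pct = round(100 * malicious_skills / total_skills, 2)

print(avg_per_skill)   # 4.03
print(malicious_pct)   # 0.16
```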

These agent-specific threats are difficult to detect with existing tools. A study of the OpenClaw ecosystem found that traditional malware scanners like VirusTotal, static analysis tools, and agent-specific risk detectors rarely agree. Only 0.69% of flagged skills were identified by all three scanner types, with 81.9% flagged by only a single scanner. This disagreement highlights a new attack surface where threats arise from natural language instructions and cross-component interactions, not just malicious code in bundled files.
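As a rough illustration of how such agreement figures are computed, the sketch below counts how many scanners flagged each skill and reports the share at each agreement level. The function name `agreement_shares`, the scanner keys, and the skill IDs are all hypothetical toy data, not results from the study.

```python
from collections import Counter


def agreement_shares(flags: dict[str, set[str]]) -> dict[int, float]:
    """Map 'number of scanners that flagged a skill' to the share of flagged skills.

    `flags` maps a scanner name to the set of skill IDs it flagged.
    """
    hits_per_skill = Counter()
    for flagged in flags.values():
        hits_per_skill.update(flagged)

    total = len(hits_per_skill)
    levels = range(1, len(flags) + 1)
    if total == 0:
        return {k: 0.0 for k in levels}

    skills_per_level = Counter(hits_per_skill.values())
    return {k: skills_per_level.get(k, 0) / total for k in levels}


# Toy example with three scanner types and five flagged skills.
toy_flags = {
    "malware_scanner": {"a", "b"},
    "static_analysis": {"b", "c", "d"},
    "agent_risk_detector": {"b", "d", "e"},
}

if __name__ == "__main__":
    for level, share in agreement_shares(toy_flags).items():
        print(f"flagged by {level} scanner(s): {share:.0%}")
```

A low share at the highest level, as in the reported 0.69%, means the scanners are mostly catching different things, so relying on any one of them leaves gaps.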

One campaign, dubbed "ClawHavoc," illustrates the scale of the problem, with a single threat actor using templated brand impersonation to distribute malicious skills. As agents are given more autonomy and access to privileged tools, securing the skill supply chain has become a critical challenge for deploying reliable and safe AI systems.

Sources: "Do Not Mention This to the User": Detecting and Understanding Malicious Agent Skills · Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems · ClawHub Security Signals: When VirusTotal, Static Analysis, and SkillSpector Disagree

EVALUATION

AI Benchmarks Are Saturating, Obscuring True Progress

A systematic study of 60 language model benchmarks finds nearly half have plateaued, making it increasingly difficult to differentiate between state-of-the-art models.

The benchmarks used to measure progress in AI are rapidly becoming obsolete, according to a February 2024 study. Researchers analyzed 60 language model benchmarks and found that nearly half are "saturated," meaning top models have reached performance plateaus that make it hard to distinguish meaningful improvements. The study found that saturation rates increase as benchmarks age, diminishing their long-term value for guiding research and deployment decisions.

The analysis revealed that design choices can impact a benchmark's longevity. Expert-curated benchmarks were found to be more resilient to saturation than those with publicly available test data, which can be inadvertently included in training sets. This suggests that the cost and effort of careful, expert-led construction pays dividends in creating more durable evaluation tools.

In response to this trend, researchers are developing more complex and robust benchmarks designed to test specific reasoning capabilities that current models still lack. For example, Polaris-Bench reformulates visual reasoning tasks in polar coordinates to break models' reliance on Cartesian shortcuts, while StemBind diagnoses failures in abstract visual reasoning by separating rule identification from answer selection. These efforts aim to create new measurement tools that can keep pace with model development.

Sources: When AI Benchmarks Plateau: A Systematic Study of Benchmark Saturation · The Cartesian Shortcut: Re-evaluate Vision Reasoning in Polar Coordinate Space · StemBind: When MLLMs Get Lost Between Rules and Instances in Abstract Visual Reasoning

MODELS

Direct Model Editing Damages Reasoning, Study Finds

Attempts to update an LLM's knowledge by directly modifying its weights consistently harm core capabilities, with retrieval-based methods proving superior and safer.

Directly editing the parameters of a large language model to update its knowledge is a flawed approach that damages the model's fundamental capabilities, according to a June 2024 paper. Researchers found that parameter-based knowledge editing methods consistently degrade performance on tasks requiring reasoning, even when the edits are localized.

The study provides a theoretical explanation for the failures, based on the "Dimensional Collapse Hypothesis." The theory suggests that localized parameter edits can propagate along fragile directions in the model's representation space, causing global interference that leads to a collapse in reasoning ability. This explains why a model might successfully store a new fact but lose the ability to use it in a multi-step argument.

In a comprehensive empirical evaluation, a simple retrieval-augmented generation (RAG) baseline outperformed every parameter-editing method tested, across varying levels of knowledge complexity and numbers of edits. For engineers who need to keep models current with new information, the findings suggest that retrieval-based approaches are simpler, more effective, and less risky than attempting to perform surgery on a model's weights.
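To make the contrast concrete, here is a minimal sketch of a retrieval-based update path, assuming a plain list of text facts and naive word-overlap scoring. The helpers `tokenize`, `retrieve`, and `build_prompt` and the example facts are hypothetical; the paper's actual RAG baseline is not described here and likely uses a stronger retriever.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase word set, used for naive overlap scoring."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, facts: list[str], k: int = 1) -> list[str]:
    """Return the k facts sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(facts, key=lambda f: len(query_words & tokenize(f)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, facts: list[str], k: int = 1) -> str:
    """Prepend retrieved facts to the question; model weights are never touched."""
    context = "\n".join(retrieve(query, facts, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Fictional facts. Updating knowledge is an append, not a weight edit.
facts = ["The Zubrowka national dish is a pastry."]
facts.append("The capital of Freedonia is Fredville.")

if __name__ == "__main__":
    print(build_prompt("What is the capital of Freedonia?", facts))
```

The point of the sketch is that new knowledge lives in the `facts` store, so an incorrect or outdated entry can be removed without any risk to the model's reasoning ability.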

Sources: Revisiting Parameter-Based Knowledge Editing in Large Language Models: Theoretical Limits and Empirical Evidence

Anthropic Files Confidentially for IPO