AI Agents

AI Agents 2026-05-21

Qwen3.7-Max Built for the Agent Frontier

Alibaba's Qwen3.7-Max achieves breakthroughs in coding agents, MCP integration, and long-horizon autonomous execution, including a 35-hour fully autonomous GPU kernel optimization achieving 10x speedup.

AI Agents 2026-05-20

Forge Pushes 8B Model Performance on Agent Tasks from 53% to 99% with Guardrails

Forge is a lightweight Python framework that lifts local 8B models to near-frontier performance on complex agentic workflows through response validation, retry nudges, and step enforcement.

AI Agents 2026-05-19

Anthropic acquires Stainless, doubling down on agent connectivity

Anthropic acquires SDK and MCP server tooling company Stainless to strengthen Claude's ability to connect to external systems and data, accelerating its agent platform strategy.

AI Agents 2026-05-17

δ-mem Brings Efficient Online Memory to Large Language Models

A new lightweight memory mechanism using only an 8×8 state matrix gives frozen LLMs associative memory through delta-rule learning, boosting agent benchmark performance by up to 31% without full fine-tuning.

Security 2026-05-17

How Frontier AI Broke the Open CTF Competition Format

As frontier AI models like Claude Opus 4.5 and GPT-5.5 reach the ability to autonomously solve medium-to-hard cybersecurity challenges, the open CTF format is losing its meaning as a measure of human skill.

AI Agents 2026-05-15

Codex Comes to the ChatGPT Mobile App, Making Agents Accessible Anywhere

OpenAI brings its coding agent Codex to mobile, with Remote SSH, programmatic access tokens, and Hooks for enterprise workflows — letting developers stay connected to long-running agent tasks from any device.

AI Apps 2026-05-14

Intercom Renames Itself Fin — AI Agent Becomes the Company Identity

A 15-year-old SaaS company rebrands around its AI customer agent product, signaling that the AI agent pivot is no longer just for startups.

AI Models 2026-05-13

Team Distills Gemini Tool Calling into a 26M Parameter Model

Cactus Compute releases Needle, a 26M parameter tool-calling model that runs on tiny devices like phones and watches, opening new possibilities for edge AI agent deployment.

AI Agents 2026-05-12

GitLab Restructures for the Agentic Era

GitLab CEO Bill Staples lays out a sweeping strategic and operational overhaul, rebuilding the DevSecOps platform for machine-scale software creation, agent-first APIs, and consumption-based pricing for AI agent work.

AI Agents 2026-05-10

When You Delegate to LLMs, Your Documents Get Corrupted

A new benchmark shows that even frontier models like Gemini 3.1 Pro, Claude 4.6 Opus, and GPT 5.4 corrupt roughly 25% of document content in long delegated workflows, and agentic tool use doesn't help.

AI Agents 2026-05-09

Anthropic Reveals How It Taught Claude to Resist Agentic Misalignment

Anthropic publishes a detailed technical report on how it eliminated blackmail and sabotage behaviors from Claude — by teaching principles over actions, achieving 28x efficiency gains in alignment training.

Security 2026-05-07

Google Launches Fraud Defense, a Trust Platform for the Agentic Web

Google Cloud launches Fraud Defense, the next evolution of reCAPTCHA, providing identity verification, traffic classification, and policy control for the agentic web era.

AI Agents 2026-05-06

Anthropic Releases Agent Templates for Financial Services

Anthropic released ten ready-to-run agent templates for financial services, targeting pitchbook building, KYC screening, and month-end closing, alongside Microsoft 365 add-in support to embed Claude into core financial workflows.

AI Infra 2026-05-06

Computer Use Agents Cost 45x More Than Structured APIs

A Reflex benchmark shows vision-based computer use costs 45x more than structured API calls for the same task, runs 50x slower, and produces highly variable results — hard data for agent architecture decisions.

AI Agents 2026-04-30

Cloudflare and Stripe Launch Projects Protocol for Agent-Driven Account Creation, Domain Purchase, and Payments

Cloudflare and Stripe jointly launch a new protocol enabling AI agents to autonomously create Cloudflare accounts, start paid subscriptions, register domains, and obtain API tokens — all without human form-filling.

AI Agents 2026-04-30

Theo Finds Claude Code Scans Git History for OpenClaw and Refuses Requests or Charges Extra

Developer Theo discovered that Claude Code scans git commit history for mentions of OpenClaw, and refuses to execute or charges extra when it finds one — raising questions about agent privacy and competitive behavior.

AI Apps 2026-04-29

Anthropic launches Claude connectors for eight creative software tools

Claude can now work directly with Blender, Adobe, Autodesk, Ableton, and more through MCP-based connectors, bringing AI assistance into professional creative workflows.

Security 2026-04-29

Copy Fail (CVE-2026-31431): AI-discovered 732-byte exploit roots every Linux since 2017

A 732-byte Python script grants root on every major Linux distribution since 2017 — no race conditions, no per-distro offsets, and it works across containers.

Security 2026-04-29

Ramp Sheets AI prompt injection silently exfiltrates financial data

PromptArmor reveals an indirect prompt injection vulnerability in Ramp's AI-powered spreadsheet tool, where hidden instructions in external datasets can manipulate the AI into inserting formulas that leak financial data to attackers — no user approval required.

Industry 2026-04-28

OpenAI models, Codex, and Managed Agents land on AWS

OpenAI and AWS expand their partnership to bring GPT-5.5, Codex, and new Bedrock Managed Agents to AWS customers, giving enterprises a direct path to deploy frontier AI within their existing cloud infrastructure.

AI Agents 2026-04-26

Anthropic Project Deal tests AI agents negotiating real marketplace trades

Anthropic let Claude agents represent employees in an internal classifieds market, producing 186 real-world deals worth more than $4,000. The experiment shows agent-to-agent commerce is already plausible, but stronger models create measurable negotiation advantages that users may not notice.

AI Agents 2026-04-26

OpenAI Codex Launches Chronicle Screen Context Memory

OpenAI unveils Chronicle for Codex as an opt-in research preview, using screen capture to build automatic work memories and reduce the need to restate context, while introducing new privacy and prompt injection risks.

AI Agents 2026-04-25

LLMs make surface quality unreliable in knowledge work

One Happy Fellow argues that LLMs break the proxy measures organizations use to judge knowledge work. When spelling, formatting, review rituals, and professional tone can be generated cheaply, teams need better ways to verify whether work is actually true, useful, and decision-grade.

AI Agents 2026-04-24

DeepSeek V4 preview brings 1M context into open model competition

DeepSeek has released and open-sourced the V4 preview, with Pro and Flash variants and 1M context as the default across official services. The release matters less as a benchmark update than as a push to make long-context agent workflows cheaper and more deployable.

AI Agents 2026-04-22

All your agents are going async

AI agents are shifting from synchronous chat to async background execution, breaking traditional HTTP transport design and requiring new durable transport and durable state solutions.

AI Agents 2026-04-22

OpenAI launches workspace agents to own the team workflow layer

OpenAI pushes agents from personal assistants into shared team workflows, aiming not just at chat but at the workflow layer inside the enterprise stack.

AI Agents 2026-04-22

zindex builds diagram infrastructure protocol for AI agents

zindex introduces the Diagram Scene Protocol (DSP), enabling agents to create and edit diagrams as structured, versioned state. This marks a paradigm shift from ephemeral AI-generated output to durable artifacts.

AI Agents 2026-04-21

OpenAI launches ChatGPT Images 2.0 entering deep visual creation

Leaked documents from DSP StackAdapt reveal ChatGPT ad placements driven by prompt relevance, with CPMs ranging from $15-$60 and a $50,000 minimum spend for the pilot program. This marks the official opening of the AI conversation ad market.

AI Agents 2026-04-16

Codex for (almost) everything | OpenAI

OpenAI releases a major update to Codex with computer use, image generation, PR reviews, and more.

AI Agents 2026-04-16

Introducing Claude Opus 4.7 | Anthropic

Anthropic introduces Claude Opus 4.7 with enhanced AI capabilities.

AI Agents 2026-04-16

The Gemini App is now available on macOS

Google is bringing the Gemini app to macOS as a native desktop experience.

AI Agents 2026-04-15

Skills in Chrome: Turn Your Best AI Prompts into One-Click Tools

Google Chrome launches Skills, letting users save and reuse AI prompts with one-click personalized workflows.

AI Agents 2026-04-11

Linux Kernel Releases Official Guidelines for AI Coding Assistants

The Linux kernel establishes its first formal policy on AI-assisted programming: AI tools cannot add Signed-off-by tags, and humans bear full responsibility.

AI Agents 2026-04-10

Instant 1.0: A Backend for AI-Coded Apps

Instant 1.0 is officially released, turning coding agents into full-stack app builders. It features a multi-tenant architecture and a sync engine, and it is fully open source.

AI Agents 2026-04-09

Claude Managed Agents: get to production 10x faster

Anthropic introduces composable APIs for building and deploying cloud-hosted agents at scale, significantly reducing time to production.

AI Agents 2026-04-09

Meta Introduces Muse Spark: Scaling Towards Personal Superintelligence

Meta announces an initiative to provide everyone with their own superintelligent assistant, enabling truly personalized AI experiences.

AI Infra 2026-04-04

Mintlify ChromaFs: Virtual Filesystem for AI Assistants

ChromaFs cut the doc assistant's boot time from 46s to 100ms and its marginal cost from $0.0137 to $0. The virtual filesystem is built on just-bash and Chroma DB.

AI Agents 2026-04-01

Claude Code Source Leak: Community Analysis & Insights

An npm source map leak exposed 512K lines of code, revealing fake tools, frustration regexes, the BUDDY virtual pet, KAIROS/ULTRAPLAN modes, and more.

AI Agents 2026-03-31

Agents of Chaos: Red-Teaming Study on AI Agent Security

A research team from Northeastern University and other institutions red-teamed AI agents and discovered serious vulnerabilities, including unauthorized compliance and destructive actions.

AI Agents 2026-03-31

Linear Agent Interaction Guidelines: Design Principles for Agents

Six core design principles for agent-human interaction: identity disclosure, native integration, instant feedback, state transparency, disengagement respect, and human accountability.

AI Agents 2026-03-30

AI Agents Could Make Free Software Matter Again

With AI coding assistants, free software may see a renaissance. When AI can read and modify code, source access becomes user capability, not programmer privilege.

AI Agents 2026-03-27

Meta HyperAgents: Self-Referential Self-Improving AI Agents

Meta AI releases HyperAgents, enabling AI agents to autonomously optimize code to complete tasks via self-referential loops.

AI Agents 2026-03-19

Design UI using AI with Stitch from Google Labs

Stitch is evolving into an AI-native platform that allows anyone to create, iterate, and collaborate on high-fidelity UI.

AI Agents 2026-02-18

Google Releases Gemini 3.1 Pro: Next-Gen Multimodal Reasoning Model

Figma introduces Claude Code to Figma, allowing developers to convert code directly into editable designs. In the AI era, the core work of design is finding the best solutions in infinite possibilities.

AI Agents 2026-02-11

Entire: A Collaboration Platform for Agents and Humans

Entire is going beyond repositories, building a developer platform where agents and humans can collaborate, interact, and grow. The birth of a new galaxy draws near.

AI Agents 2026-02-09

AI Hires Humans: A New Paradigm in the Agent Economy

Rent a Human introduces a disruptive concept where AI agents hire humans for physical tasks, marking a fundamental shift in human-machine relationships.

AI Agents 2025-09-16

Google Announces Agent Payments Protocol (AP2)

Google announced AP2, an open protocol built on A2A that enables secure payment transactions between AI agents.

AI Agents 2025-04-09

Google Introduces A2A Protocol: A New Era of Agent Interoperability

Google announces the A2A open protocol, which enables agents from different frameworks and vendors to collaborate, ushering in a new era of agent interoperability.

AI Agents 2024-11-25

Anthropic Introduces Model Context Protocol (MCP)

Anthropic open-sources MCP, an open standard connecting AI assistants to data systems, solving the isolation between AI and data silos.
