aws.amazon.com web signal

AWS OpenSearch Serverless Scales to Zero, Cuts Costs 60%

Key insights

  • The new OpenSearch Serverless scales to zero when idle, delivering up to 60% cost savings over always-on provisioned clusters.
  • OpenSearch Agent Skills bring native vector search into Kiro, Cursor, and Claude Code without requiring infrastructure configuration.
  • The rebuilt service creates collections in seconds and handles capacity spikes 20x faster than the previous generation.

Why this matters

Retrieval infrastructure has been a persistent ops tax on agentic applications, requiring teams to pre-provision capacity for unpredictable query volumes. AWS is now making vector search disposable and on-demand, which removes a key friction point from the agent development loop and positions OpenSearch as a default rather than a deliberate architectural choice. The native integrations with Claude Code and Cursor mean the retrieval layer becomes invisible to developers building inside those environments, which could shape vector database adoption patterns across the agentic stack for years.

Summary

AWS rebuilt Amazon OpenSearch Serverless for agentic workloads on May 28, delivering scale-to-zero capacity and up to 60% cost reduction versus provisioned clusters. Collections now spin up in seconds, with burst capacity 20x faster than the prior generation, designed for the spike-heavy retrieval patterns that multi-agent systems produce. The launch also ships OpenSearch Agent Skills, composable integrations that bring vector search natively into Kiro, Cursor, and Claude Code. In effect, retrieval infrastructure is now addressable directly from agent tooling built by AWS, Anthropic, and Cursor.

  • Scales to zero when idle, eliminating standby costs for low-traffic or experimental agent workloads.
  • 20x faster burst handling addresses the unpredictable traffic spikes common in RAG and multi-agent pipelines.
  • Generally available today across all AWS commercial regions.

AWS is staking out the retrieval layer of the agentic stack before tooling defaults consolidate.

Potential risks and opportunities

Risks

  • Teams migrating from provisioned OpenSearch clusters to the new serverless generation risk undocumented behavioral differences in query performance at scale, with no migration SLA published at launch
  • Competitors including Pinecone, Weaviate, and Qdrant could accelerate their own native IDE integrations within 60-90 days, narrowing the tooling advantage AWS holds today
  • Developers adopting OpenSearch Agent Skills inside Cursor or Claude Code create a tighter AWS dependency in their agentic stack, increasing lock-in exposure if pricing changes post-GA

Opportunities

  • Vector database vendors (Pinecone, Weaviate, Qdrant) face pressure to ship comparable native integrations with Cursor and Claude Code before developer habits form around OpenSearch Agent Skills
  • AWS partners building production agentic applications on Kiro and Cursor gain a differentiating capability without requiring separate vector infrastructure procurement or configuration overhead
  • Enterprises already standardized on AWS infrastructure can accelerate RAG and multi-agent pilots by removing vector search provisioning from the critical path, shortening time-to-prototype by days

What we don't know yet

  • The pricing model for scale-to-zero billing has not been published, and cold-start latency when scaling from zero is unspecified at launch
  • Whether OpenSearch Agent Skills support agent frameworks beyond the three named integrations, including self-hosted or on-premises deployments
  • The 60% cost reduction figure's baseline is undefined: cluster size, workload profile, and query volume used for the comparison are not disclosed
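
Because the baseline behind the 60% figure is undisclosed, it may help to see how sensitive any such number is to workload shape. The sketch below is a back-of-the-envelope model, not AWS's pricing: the function name `savings_fraction`, the hourly rates, and the 730-hour month are all hypothetical assumptions chosen for illustration, and the model ignores storage charges and cold-start effects, which the launch materials leave unspecified.

```python
def savings_fraction(
    active_hours: float,
    total_hours: float,
    serverless_rate: float,
    provisioned_rate: float,
) -> float:
    """Fraction of cost saved by paying only for active hours
    instead of running an always-on cluster for the whole period.

    All inputs are hypothetical; real billing units and rates are
    not published in the source.
    """
    if total_hours <= 0 or provisioned_rate <= 0:
        raise ValueError("total_hours and provisioned_rate must be positive")
    if not 0 <= active_hours <= total_hours:
        raise ValueError("active_hours must lie between 0 and total_hours")

    provisioned_cost = provisioned_rate * total_hours  # billed around the clock
    serverless_cost = serverless_rate * active_hours   # billed only while active
    return 1 - serverless_cost / provisioned_cost


if __name__ == "__main__":
    month = 730  # assumed hours in a billing month
    # Illustrative scenarios: (label, active hours, serverless rate, provisioned rate)
    scenarios = [
        ("active 40% of the month, equal rates", 292, 1.0, 1.0),
        ("active 40% of the month, serverless 2x rate", 292, 2.0, 1.0),
        ("always active, equal rates", 730, 1.0, 1.0),
    ]
    for label, active, s_rate, p_rate in scenarios:
        saved = savings_fraction(active, month, s_rate, p_rate)
        print(f"{label}: {saved:.0%} saved")
```

Under these assumed inputs, a 60% saving corresponds to a workload that is active 40% of the time at an equal hourly rate. A higher per-hour serverless rate or a busier workload shrinks the saving quickly, which is why the undisclosed cluster size, workload profile, and query volume matter when evaluating the claim.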