Provision Postgres at Scale with the Neon API
Embed Postgres in your platform or agent, with instant provisioning, scale-to-zero, and API-enforced usage limits

Neon offers the most powerful Postgres API on the market for applications that need to provision and manage thousands of databases programmatically. Whether you’re building a platform (like Retool) or deploying infrastructure through AI agents (like Replit), Neon gives you the tools to do it efficiently and at scale.
Why is that? It starts with Neon’s serverless architecture, which enables instant provisioning and scale-to-zero, making it a natural fit for these use cases. Building on this advantage (and informed by early partnerships with products like Vercel Postgres and RetoolDB), we’ve gained deep experience with embedded Postgres workflows. We’ve since consolidated that experience by powering fully autonomous database provisioning in platforms like Replit Agent and Create.xyz, where AI agents spin up thousands of databases daily using Neon.
From there, we’ve continued to evolve our API to offer deeper levels of control. The result is an API specialized for managing fleets of Postgres databases at scale.
The Use Cases Driving API-First Postgres
The Neon API is useful for any developer, but when it comes to managing Postgres fleets at scale, two use cases stand out where it’s a perfect fit:
- Platforms with embedded Postgres. These are developer platforms that let users spin up dedicated Postgres databases as part of their product experience (e.g. Vercel, Koyeb, Genezio).
- AI agents. These are AI tools like Replit Agent or Create.xyz that dynamically deploy Postgres databases while building apps on behalf of users. They generate thousands of databases daily as they work, relying on Neon to provision and manage each one behind the scenes.
Are you building something similar?
How the Neon API Helps You Manage Thousands of Postgres DBs
For platforms and agents deploying thousands of isolated databases, the Neon API offers fine-grained control over provisioning, scaling, and usage. Paired with Neon’s instant provisioning and scale-to-zero capabilities, it gives teams the flexibility to manage large fleets of Postgres databases programmatically, while staying cost-efficient.
In most embedded Postgres workflows, one Neon project maps to one Postgres database. That means managing a fleet of databases at scale is really about managing thousands of Neon projects, and the features we’ve progressively added to the Neon API while working with partners make this possible even for small teams.
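To make that concrete, here’s a minimal sketch of what per-user provisioning can look like against the Neon API, written in TypeScript with `fetch`. The project name, region, and response handling are assumptions for illustration, not a definitive integration:

```typescript
// Minimal sketch: provision one Neon project (one Postgres database) per end user.
// Assumes NEON_API_KEY is set; the project name and region_id are placeholders.
const NEON_API = "https://console.neon.tech/api/v2";

export async function provisionUserDatabase(userId: string) {
  const res = await fetch(`${NEON_API}/projects`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.NEON_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      project: { name: `user-${userId}`, region_id: "aws-us-east-1" },
    }),
  });
  if (!res.ok) throw new Error(`Neon API error: ${res.status}`);

  // The create-project response includes the new project plus connection details
  // (response shape assumed here for illustration).
  const { project, connection_uris } = await res.json();
  return {
    projectId: project.id as string,
    connectionUri: connection_uris?.[0]?.connection_uri as string,
  };
}
```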
Enforce consumption quotas per project
With the Neon API, you can set hard limits on resource consumption for each project—ensuring cost control and predictable resource allocation.
You can define:
- Max compute uptime allowed per billing cycle (via `active_time_seconds`)
- Max CPU seconds allowed across all computes (via `compute_time_seconds`)
- Max amount of data written for the month (via `written_data_bytes`)
- Max storage per branch (via `logical_size_bytes`)
Why this matters: This not only prevents unexpected resource consumption but also gives you better control over multi-tiered pricing models. For example, you can set different quotas for each plan.
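Here’s a rough sketch of what setting per-plan quotas could look like, assuming the quota fields above are passed under the project’s `settings.quota` object; the limits are arbitrary values for a hypothetical free tier:

```typescript
// Sketch: apply per-plan consumption quotas to an existing project.
// Quota placement under project.settings.quota is assumed; the limits are
// example values for a hypothetical free tier.
export async function applyFreeTierQuotas(projectId: string) {
  const res = await fetch(`https://console.neon.tech/api/v2/projects/${projectId}`, {
    method: "PATCH",
    headers: {
      Authorization: `Bearer ${process.env.NEON_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      project: {
        settings: {
          quota: {
            active_time_seconds: 360_000,       // max compute uptime per billing cycle
            compute_time_seconds: 720_000,      // max CPU seconds across all computes
            written_data_bytes: 10_737_418_240, // max data written per month (10 GiB)
            logical_size_bytes: 1_073_741_824,  // max storage per branch (1 GiB)
          },
        },
      },
    }),
  });
  if (!res.ok) throw new Error(`Failed to set quotas: ${res.status}`);
  return res.json();
}
```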
Control compute settings
The API also lets you define precise autoscaling behavior for each project:
- Set the minimum vCPU size (`autoscaling_limit_min_cu`)
- Set the maximum vCPU size (`autoscaling_limit_max_cu`)
- Define the timeout before compute suspends (`suspend_timeout_seconds`)
Why this matters: You can tune performance and efficiency per tier. Free-tier projects can scale down aggressively, while enterprise-tier ones stay warm with higher limits.
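A similar sketch for compute settings, assuming the fields above are applied through the project’s `default_endpoint_settings`; the per-tier values are illustrative, not recommendations:

```typescript
// Sketch: tune autoscaling and scale-to-zero behavior per pricing tier.
// Placement under default_endpoint_settings is assumed; tier values are illustrative.
type Tier = "free" | "enterprise";

const TIER_COMPUTE_SETTINGS: Record<Tier, Record<string, number>> = {
  free: {
    autoscaling_limit_min_cu: 0.25, // smallest compute size
    autoscaling_limit_max_cu: 1,
    suspend_timeout_seconds: 60,    // scale to zero aggressively
  },
  enterprise: {
    autoscaling_limit_min_cu: 2,
    autoscaling_limit_max_cu: 8,
    suspend_timeout_seconds: 3600,  // stay warm longer
  },
};

export async function applyComputeSettings(projectId: string, tier: Tier) {
  const res = await fetch(`https://console.neon.tech/api/v2/projects/${projectId}`, {
    method: "PATCH",
    headers: {
      Authorization: `Bearer ${process.env.NEON_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      project: { default_endpoint_settings: TIER_COMPUTE_SETTINGS[tier] },
    }),
  });
  if (!res.ok) throw new Error(`Failed to update compute settings: ${res.status}`);
  return res.json();
}
```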
Track consumption across your database fleet
You also have visibility into usage across thousands of projects:
- Total compute uptime (`active_time_seconds`)
- Total CPU time used (`compute_time_seconds`)
- Total data written (`written_data_bytes`)
- Total outbound data transferred (`data_transfer_bytes`)
Why this matters: You can monitor usage in real time, alert users when they approach limits, and implement billing based on actual consumption.
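As a sketch, here’s one way to read those metrics back and flag users approaching their limits, assuming the fields above are returned on the project object for the current billing period (Neon also exposes dedicated consumption endpoints):

```typescript
// Sketch: read back a project's consumption metrics and flag users near quota.
// Assumes the metrics are returned on the project object for the current billing
// period; dedicated consumption endpoints are another option.
export async function checkUsage(projectId: string, uptimeQuotaSeconds: number) {
  const res = await fetch(`https://console.neon.tech/api/v2/projects/${projectId}`, {
    headers: { Authorization: `Bearer ${process.env.NEON_API_KEY}` },
  });
  if (!res.ok) throw new Error(`Failed to fetch project: ${res.status}`);
  const { project } = await res.json();

  const usage = {
    activeTimeSeconds: project.active_time_seconds,   // total compute uptime
    computeTimeSeconds: project.compute_time_seconds, // total CPU time used
    writtenDataBytes: project.written_data_bytes,     // total data written
    dataTransferBytes: project.data_transfer_bytes,   // total outbound data transferred
  };

  // Example policy: notify the user once 80% of the uptime quota is consumed.
  const nearLimit = usage.activeTimeSeconds >= 0.8 * uptimeQuotaSeconds;
  return { usage, nearLimit };
}
```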
Interfaces for AI agents
Neon also offers dedicated interfaces for AI agents that need to spin up and manage Postgres databases in real time. These interfaces are built to support agentic workflows where infrastructure is provisioned on the fly and managed autonomously, as in Replit Agent or Create.xyz.
Neon Model Context Protocol (MCP) Server
The Model Context Protocol (MCP) is an emerging standard that enables large language models and AI agents to interact with APIs and developer tools using structured, natural-language commands. Neon’s MCP server enables agents to perform tasks like:
- Creating and deleting Postgres databases
- Running SQL queries
- Managing branches and migrations
Originally, Neon’s MCP server was available for local use. Now, with the launch of the remote MCP server, developers no longer need to install or configure anything locally. The hosted version uses OAuth 2.1 for secure authentication, allowing agents to act on behalf of users without requiring API keys.
@neondatabase/toolkit
We also have a lightweight, agent-friendly toolkit for provisioning and querying Neon databases. It includes:
- The Neon TypeScript SDK, for programmatic access to Neon resources.
- The Neon Serverless Driver, which supports SQL queries over HTTP and works in edge environments like Cloudflare Workers or Vercel Functions.
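To give a feel for the flow, here’s a rough sketch modeled on the toolkit’s published quick start (treat exact method signatures as approximate): an agent creates a throwaway database, runs SQL against it, and deletes it when done.

```typescript
// Sketch of an agent-style flow with @neondatabase/toolkit, modeled on its
// published quick start: create a project, run SQL, then tear it down.
import { NeonToolkit } from "@neondatabase/toolkit";

const toolkit = new NeonToolkit(process.env.NEON_API_KEY!);

// Provision a fresh project (one Postgres database) for the agent to work in.
const project = await toolkit.createProject();

await toolkit.sql(
  project,
  `CREATE TABLE IF NOT EXISTS users (
     id UUID PRIMARY KEY,
     name TEXT NOT NULL
   );`,
);

await toolkit.sql(
  project,
  `INSERT INTO users (id, name) VALUES (gen_random_uuid(), 'Ada Lovelace');`,
);

console.log(await toolkit.sql(project, `SELECT name FROM users;`));

// Clean up once the agent no longer needs the database.
await toolkit.deleteProject(project);
```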
Wrap Up
If you’re building a platform or AI agent that needs to provision and manage databases at scale, Neon is exactly what you’re looking for. Reach out to us to get more information or credits for a PoC.