Why Cloudflare is the Perfect Infrastructure for Building AI Applications
2025-04-14
Niko Korvenlaita

A Love Letter to Cloudflare

You know that feeling when you find the perfect tool for a job? That's how I feel about Cloudflare right now. If you've been following along, I recently wrote about the challenges of implementing a remote MCP server because of its stateful nature. Today, I want to share why Cloudflare has become my infrastructure provider of choice for the AI age.

From DNS Provider to Edge Computing Pioneer

Most people know Cloudflare as a DNS provider with some extra services like DDoS protection. That's how I got started with them too. But everything changed when they launched Cloudflare Workers - a serverless execution runtime similar to AWS Lambda, but with a crucial difference: Cloudflare Workers uses lightweight V8 isolates deployed globally, resulting in significantly faster cold starts and lower latency than Lambda.

When I first tried Workers, I was immediately excited by their implementation. Unlike AWS Lambda, there was virtually zero cold start time. You could run Workers in front of your actual servers, handling certain logic at the edge or passing requests through to your backend.

This website runs on Workers, our web app runs on Workers, and our core APIs run on Workers. I'm a huge fan.

But the real magic came later.

Durable Objects: A Beautiful Primitive

Cloudflare has continuously improved the Workers product, expanding beyond compute into state and storage, first by adding a key-value store for persistence. But their most brilliant addition has been Durable Objects.

So what exactly are Durable Objects? Essentially, you implement a JavaScript class extending from the DurableObject base class. You can't call these objects directly from outside the Cloudflare ecosystem - you have to call them from a Cloudflare Worker context. When you initialize the class, you do it with an ID, and the worker code gets a stub to the durable object.

Here's the beautiful part: there is globally one instance of a durable object identified by its ID. If you connect with that ID, you'll always connect to the same instance running in a single-threaded JavaScript environment.

This solves one of the hardest problems in distributed systems: maintaining global state without complicated locking mechanisms or consensus algorithms. Cloudflare handles this for you.
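
To make that concrete, here's a minimal sketch of the Worker side. The `idFromName` and `get` calls mirror the Durable Object namespace binding; the two interfaces are local stand-ins so the snippet type-checks on its own, and `stubForOrganization` plus the `org:` prefix are hypothetical names used only for illustration.

```typescript
// Minimal local stand-ins for the parts of the Durable Object namespace
// binding used here. In a real Worker these types come from Cloudflare's
// runtime typings; they are redeclared so the sketch is self-contained.
interface ObjectIdLike {
  toString(): string;
}

interface NamespaceLike<Stub> {
  // Deterministically maps a name to an object ID.
  idFromName(name: string): ObjectIdLike;
  // Returns a stub for the single global instance behind that ID.
  get(id: ObjectIdLike): Stub;
}

// Hypothetical helper: every caller that passes the same organization ID
// ends up talking to the same Durable Object instance.
function stubForOrganization<Stub>(
  namespace: NamespaceLike<Stub>,
  organizationId: string,
): Stub {
  const id = namespace.idFromName("org:" + organizationId);
  return namespace.get(id);
}
```

In a Worker you would pass in the namespace binding from `env` (under whatever name you gave it in your configuration) and then call methods on the returned stub.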

Not only that, but you can also persist state. 😱

When nothing's happening, the durable object goes to sleep. You're only billed when it's actually executing something. When it wakes up, you can load state back from persistent storage. It's such an elegant primitive that when Cloudflare launched it, I immediately saw the possibilities - even before the rise of agentic workflows, LLMs, and MCP.
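
Here's a minimal sketch of that sleep-and-wake cycle, assuming a key-value style storage with async `get` and `put` like the one Durable Objects expose. `StorageLike` is a local stand-in interface and `VisitCounter` is a hypothetical example, not code from our app.

```typescript
// Local stand-in for the async key-value storage a Durable Object gets.
interface StorageLike {
  get<T>(key: string): Promise<T | undefined>;
  put<T>(key: string, value: T): Promise<void>;
}

// Hypothetical example: a counter that survives the object going to sleep.
class VisitCounter {
  private readonly storage: StorageLike;
  // In-memory copy; gone whenever the object is evicted from memory.
  private cached: number | undefined;

  constructor(storage: StorageLike) {
    this.storage = storage;
  }

  async increment(): Promise<number> {
    let visits = this.cached;
    if (visits === undefined) {
      // First call after waking up: reload the last persisted value.
      visits = (await this.storage.get<number>("visits")) ?? 0;
    }
    visits += 1;
    this.cached = visits;
    // Persist before returning so the value outlives this in-memory instance.
    await this.storage.put("visits", visits);
    return visits;
  }
}
```

Creating a second `VisitCounter` over the same storage stands in for the object waking up again: the in-memory copy is gone, but the count picks up where it left off.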

Real-World Applications of Durable Objects

Let me share some practical examples of how we've used Durable Objects at reconfigured:

OAuth Management

When building reconfigured v1, we used Durable Objects to manage GitHub integrations. If you've dealt with OAuth, you know the pain of handling access tokens, refresh tokens, and the periodic refresh dance.

Each customer's GitHub integration was a single Durable Object, which nicely encapsulated all the authentication logic. When pulling from or pushing to git, we made calls to that particular object, which ensured their access token was always fresh. There were no race conditions during refreshes.

Some OAuth providers don't just refresh the access token but also refresh the refresh token itself. Handling that state and avoiding race conditions isn't trivial - I've done it multiple times and it's always annoying. With Durable Objects, we could isolate that logic and use the native locking from Durable Objects to ensure only one refresh call happened at a time, while other API calls waited for the refresh to complete.
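
The refresh logic can be sketched roughly like this. It's plain TypeScript rather than our production code: `TokenSet`, `RefreshFn`, and `TokenManager` are hypothetical names, the 30-second safety margin is an arbitrary choice, and the actual call to the OAuth provider is injected as a function. The point is that the first caller starts the refresh and everyone else awaits the same promise.

```typescript
// Hypothetical token shape; real providers return more fields.
interface TokenSet {
  accessToken: string;
  refreshToken: string;
  expiresAt: number; // epoch milliseconds
}

// Injected call to the OAuth provider's token endpoint.
type RefreshFn = (refreshToken: string) => Promise<TokenSet>;

class TokenManager {
  private tokens: TokenSet;
  private readonly refresh: RefreshFn;
  private readonly now: () => number;
  // Shared promise for a refresh that is already underway.
  private inFlight: Promise<TokenSet> | undefined;

  constructor(initial: TokenSet, refresh: RefreshFn, now: () => number = Date.now) {
    this.tokens = initial;
    this.refresh = refresh;
    this.now = now;
  }

  async getAccessToken(): Promise<string> {
    // Refresh a little early so callers never get a token about to expire.
    const skewMs = 30000;
    if (this.tokens.expiresAt - skewMs > this.now()) {
      return this.tokens.accessToken;
    }
    let pending = this.inFlight;
    if (pending === undefined) {
      // First caller starts the refresh; later callers await the same promise.
      pending = this.refresh(this.tokens.refreshToken).then((next) => {
        // Store the rotated refresh token together with the new access token.
        this.tokens = next;
        return next;
      });
      this.inFlight = pending;
    }
    try {
      return (await pending).accessToken;
    } finally {
      // Clear the marker only if it still points at the refresh we awaited.
      if (this.inFlight === pending) this.inFlight = undefined;
    }
  }
}
```

Because one Durable Object instance owns the tokens for one integration, an instance of something like `TokenManager` living on that object is the only place a refresh can start.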

Per-Tenant Database Management

In our current reconfigured application, we use Durable Objects to route and handle requests related to journaling data. Since we use per-tenant databases that are also synced to end users' own machines, we need to provision and rotate access tokens frequently. We have one Durable Object per organization for handling the database authentication, with the object ID derived from the organization ID.

Because we can't bombard our database provider with constant requests for new tokens, we cache the per-tenant database token on the per-tenant Durable Object, which lets us reuse the token for multiple calls before rotating it. This makes it super easy to make database calls from stateless Worker code or from other stateful Durable Objects.

Chat Implementation

Each AI-chat in our application is also a Durable Object. Cloudflare initially offered simple KV storage within Durable Objects, but they recently added support for SQLite, making it incredibly easy to model and query data that's scoped to that specific context.

For chats, we have a chat history table with all messages for that specific chat within its Durable Object. This isolation is perfect - the chat history is only needed on that instance. There's no need to worry about scaling challenges either: the data is partitioned at the Durable Object level, which keeps operations on the chat history blazingly fast.
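
As a rough sketch, a per-chat history table could look like the following. The `exec(query, ...bindings)` call returning a cursor with `toArray()` mirrors the Durable Object SQL storage API, but `SqlLike` and `SqlCursorLike` are looser local stand-ins, and the `messages` schema and `ChatHistory` class are hypothetical rather than our actual data model.

```typescript
// Local stand-ins for the slice of the Durable Object SQL storage API used
// below. The real typings are stricter; these keep the sketch self-contained.
interface SqlCursorLike<Row> {
  toArray(): Row[];
}

interface SqlLike {
  exec<Row>(query: string, ...bindings: unknown[]): SqlCursorLike<Row>;
}

// Hypothetical row shape for one chat message.
type ChatMessageRow = {
  id: number;
  role: string;
  content: string;
  created_at: number;
};

class ChatHistory {
  private readonly sql: SqlLike;

  constructor(sql: SqlLike) {
    this.sql = sql;
    // One table per object: no chat_id column is needed because the
    // Durable Object itself is the partition.
    this.sql.exec(
      `CREATE TABLE IF NOT EXISTS messages (
         id INTEGER PRIMARY KEY AUTOINCREMENT,
         role TEXT NOT NULL,
         content TEXT NOT NULL,
         created_at INTEGER NOT NULL
       )`,
    );
  }

  append(role: string, content: string, createdAt: number): void {
    this.sql.exec(
      "INSERT INTO messages (role, content, created_at) VALUES (?, ?, ?)",
      role,
      content,
      createdAt,
    );
  }

  recent(limit: number): ChatMessageRow[] {
    // Fetch newest first, then reverse so callers get chronological order.
    return this.sql
      .exec<ChatMessageRow>(
        "SELECT id, role, content, created_at FROM messages ORDER BY id DESC LIMIT ?",
        limit,
      )
      .toArray()
      .reverse();
  }
}
```

Inside a Durable Object you would hand the object's SQL storage handle to the constructor; every query then only ever sees that one chat's messages.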

Perfect Pricing for AI Workloads

Cloudflare's pricing model works beautifully for AI applications. You're only billed for the CPU time you use, not for waiting time. In LLM-based applications, much of the time is spent waiting for the LLM to respond and streaming that response back to the client. All that wait time isn't billed, making it perfectly priced for this era.

The Agents SDK: Building on Primitives

Cloudflare isn't just providing primitives - they're building higher-level abstractions too! They've shipped an agents SDK npm package with base classes for AI chat and other agentic LLM use cases, extending from the Durable Object primitive.

Implementing an LLM-based chat with a Durable Object using their agents SDK is incredibly straightforward. On the client side, they build on Vercel's AI SDK library to provide hooks like useAgentChat. When you need to connect to the chat from a React application, you just import the hooks, give them the correct ID, pass the authentication details, and you're rocking! You just need to implement the UI side to make the app look like your own.

Within the Durable Object, all the boilerplate is handled by the agents SDK package. You simply implement a method called onChatMessage that gets called when the user sends a new chat message, and then you can do whatever you want - pass it to an LLM and stream back the response, or implement more complex logic.

The Perfect Primitive for MCP

Remember the stateful challenges of MCP I mentioned in my previous post? Durable Objects solve this elegantly.

For MCP, you need a stateful application identified by a session ID. With Durable Objects, every time you connect to that session, you're connecting to the same object that can persist state between invocations. And since it's serverless, you're not billed while the object is hibernating.

This is perfect for MCP, which might be idling for long periods between user actions and agent calls. When an agent calls the MCP, the object wakes up quickly, does what's needed, responds, and hibernates again.

Cloudflare's agents SDK even includes a base class, McpAgent, for implementing MCP servers. While we at reconfigured don't have an MCP server yet, we're working on it, and Cloudflare's primitives make it remarkably straightforward.

The Complete Infrastructure Package

Beyond Durable Objects, Cloudflare offers a comprehensive infrastructure solution. Here are just a few of the services we're actively using:

Why Cloudflare Makes Sense in the AI Age

Traditional infrastructure setups like Lambda, Kubernetes, Google Cloud Run, or Amazon ECS were designed for stateless workflows. They make implementing stateful protocols like MCP challenging.

Cloudflare's offering, with primitives like Durable Objects, is perfectly aligned with where computing is headed in the AI age. The state and execution are nicely isolated, scaling well with users and customers (unless you misuse it by pushing everything into a single Durable Object - the number of objects you can have is unlimited).

This isn't a paid advertisement - I'm just a happy customer sharing why I love Cloudflare and how it makes building AI applications easier, especially when implementing remote MCP servers. The combination of Cloudflare Workers, Durable Objects, and their agents SDK simplifies what would otherwise be a complex architecture.

If you want to chat, hit me up on LinkedIn, either via DMs, or directly on the feed 😊