Latest Posts

The missing transport layer in user-facing AI applications

Mar 19, 2026 By Amber Dawson In Ably

Most AI applications start the same way: wire up an LLM, stream tokens to the browser, ship. That works for simple request-response. It breaks when sessions outlast a connection, when users switch devices, or when an agent needs to hand off to a human. The cracks appear in the delivery layer, not the model. Every serious production team discovers this independently and builds their own workaround. Those workarounds don't hold once users start hitting them in production.

Resume tokens and last-event IDs for LLM streaming: How they work & what they cost to build

Mar 12, 2026 By Amber Dawson In Ably

When an AI response reaches token 150 and the connection drops, most implementations have one answer: start over. The user re-prompts, you pay for the same tokens twice, and the experience breaks. Resume tokens and last-event IDs are the mechanism that prevents this. They make streams addressable – every message gets an identifier, clients track their position, and reconnections pick up from exactly where they left off. The concept is straightforward.
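As a rough illustration of the idea in this teaser, here is a minimal in-memory sketch in Python. It is not taken from the post and is not Ably's API: `StreamLog`, `Client`, `read_after`, and the `limit` parameter are hypothetical names, and the event ID is assumed to be simply the token's position in the log.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class StreamLog:
    """Hypothetical server-side log for one response; an event's ID is its list index."""
    tokens: List[str] = field(default_factory=list)

    def append(self, token: str) -> int:
        """Store a token and return the ID assigned to it."""
        self.tokens.append(token)
        return len(self.tokens) - 1

    def read_after(self, last_event_id: Optional[int]) -> Iterator[Tuple[int, str]]:
        """Yield (event_id, token) pairs that come after the given ID."""
        start = 0 if last_event_id is None else last_event_id + 1
        for event_id in range(start, len(self.tokens)):
            yield event_id, self.tokens[event_id]


@dataclass
class Client:
    """Hypothetical client that remembers the last event ID it has processed."""
    last_event_id: Optional[int] = None
    received: List[str] = field(default_factory=list)

    def consume(self, log: StreamLog, limit: Optional[int] = None) -> None:
        """Read from the log; `limit` simulates a connection that drops early."""
        for count, (event_id, token) in enumerate(log.read_after(self.last_event_id)):
            if limit is not None and count >= limit:
                break  # simulated disconnect
            self.received.append(token)
            self.last_event_id = event_id  # track position for the next reconnect


log = StreamLog()
for token in ["Resume", " from", " where", " you", " left", " off"]:
    log.append(token)

client = Client()
client.consume(log, limit=2)  # connection drops after two tokens
client.consume(log)           # reconnect: continues after last_event_id
```

After the second call, the client holds the whole response without receiving any token twice. A production version would also need durable storage and retention limits, which this sketch leaves out.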

Why AI agents need a transport layer: Solving the realtime sync problem

Mar 9, 2026 By Amber Dawson In Ably

Building AI agents that work reliably in production requires solving problems that have nothing to do with AI. While teams focus on prompt engineering, model selection, and agent orchestration, a different class of challenges emerges at deployment. These have little to do with LLMs and everything to do with keeping agents and clients synchronized in realtime. Over the past few months, we've spoken with engineers at over 40 companies building AI assistants, copilots, and agentic workflows.

Why your AI response restarts on page refresh (and what it takes to prevent it)

Mar 6, 2026 By Amber Dawson In Ably

Your AI assistant is mid-sentence explaining a complex debugging strategy. The user refreshes the page. The response starts over from the beginning, or worse, vanishes entirely. This isn't a model problem. It's a delivery problem.

WebSockets vs HTTP for AI applications: which to choose in 2026

Feb 15, 2026 By Faye McClenahan In Ably

When building AI experiences, choosing between WebSockets and HTTP isn't always straightforward. Which protocol is better for streaming LLM responses? How do you maintain continuity when users switch devices mid-conversation? Should you use both? The answer depends on the type of AI experience you're building. Modern AI applications often require both protocols, each serving different purposes. The key question is: how do you decide which communication pattern fits each scenario in your AI stack?

Edit and delete messages without rewriting your history layer

Feb 13, 2026 By Faye McClenahan In Ably

Editing or removing a message after it’s been published sounds simple. In realtime systems, it usually isn’t. Once a message has been delivered to multiple clients, cached locally, and written into history, changing it safely becomes a coordination problem. Clients need to agree on what’s current. History needs to stay consistent. Reconnects and refreshes can’t bring back stale content. That’s why many systems treat messages as immutable by default.

Appends for AI apps: Stream into a single message with Ably AI Transport

Feb 10, 2026 By Faye McClenahan In Ably

Streaming tokens is easy. Resuming cleanly is not. A user refreshes mid-response, another client joins late, a mobile connection drops for 10 seconds, and suddenly your “one answer” is 600 tiny messages that your UI has to stitch back together. Message history turns into fragments. You start building a side store just to reconstruct “the response so far”. This is not a model problem. It’s a delivery problem. That’s why we developed message appends for Ably AI Transport.

Why orchestrators become a bottleneck in multi-agent AI

Jan 29, 2026 By Faye McClenahan In Ably

Complex user tasks often need multiple AI agents working together, not just a single assistant. That’s what agent collaboration enables. Each agent has its own specialism (planning, fetching, checking, or summarising), and they work in tandem to get the job done. The experience feels intelligent and joined-up, not monolithic or linear. But making that work means more than prompt chaining or orchestration logic.

Multi-agent AI systems need infrastructure that can keep up

Jan 22, 2026 By Amber Dawson In Ably

When you're building agentic AI applications with multiple agents working together, the infrastructure challenges show up fast. Agents need to coordinate, users need visibility into what's happening, and the whole system needs to stay responsive even as tasks branch out across specialised workers. We built a multi-agent travel planning system to understand these problems better. What we learned applies well beyond holiday booking.

Realtime steering: interrupt, barge-in, redirect, and guide the AI

Jan 21, 2026 By Faye McClenahan In Ably

Start typing, change your mind, redirect the AI mid-response. It just works. That is the promise of realtime steering. Users expect to interrupt an answer, correct its direction, or inject new instructions on the fly without losing context or restarting the session. It feels simple, but delivering it requires low-latency control signals, reliable cancellation, and shared conversational state that survives disconnects and device switches.
