---
title: "Agent Browsers for OpenClaw"
description: "A quest to integrate browsers in OpenClaw, in an automated and resource-aware way"
updated: 2026-06-02
canonical: https://holaclaw.ai/blog/agent-browsers-for-openclaw
---

An agent is only as useful as the things it can reach, and a browser is how it reaches them: opening a page, filling a form, submitting it, following the redirect, and signing in to the services where your real information actually lives. So configuring a browser for OpenClaw is table stakes; the interesting question is *which* browser, and how to wire it in.

## Picking a browser for OpenClaw

OpenClaw drives a browser through the Chrome DevTools Protocol, and out of the box that means a managed headless Chromium. It works and it's the path OpenClaw is built around, but it's heavy: hundreds of megabytes on disk and 300 MB+ of RAM per browser.

That weight matters for us. At HolaClaw your assistant runs inside its own isolated virtual machine, and keeping each machine lean is exactly what lets us give every user a dedicated environment. The lighter the browser, the more headroom we have. There's also a sharper symptom: on a small VM, the cold start of a headless Chromium on the very first browse can run long enough to hit OpenClaw's tool timeout, so the assistant occasionally gives up before the browser has finished launching.

So we went looking for something lighter and more agent-focused. Enter [Lightpanda](https://lightpanda.io/), a browser built for automation with a tiny footprint (roughly 24 MB). On paper, exactly the kind of efficient browser a lean VM wants.

## How

There are several ways to wire Lightpanda into OpenClaw. It's worth saying up front that Lightpanda is a fast-moving target and still in beta, so expect woes and shenanigans in a few places.

### Attempt 1: MCP

We first integrated Lightpanda through MCP. This worked quite well for the reading and summarizing use case. It fell apart on the interactive bit: when we submitted a form that redirects to another page, the form was submitted but the browser never followed through to the next page. On top of that, driving the browser through the MCP surface was not ergonomic for many of the actions we needed.

### Attempt 2: Chrome DevTools Protocol (CDP)

Lightpanda also implements the Chrome DevTools Protocol (CDP), the same protocol OpenClaw uses to drive real Chromium. So we pointed OpenClaw's browser tool at a `lightpanda serve` endpoint and tried again. That gave us two results that seemed to contradict each other:

- **The assistant still couldn't browse.** It would create a new tab, and then on the very next call fail to find the tab it had just created.

- **Lightpanda could log in just fine.** Over a *single, persistent* CDP connection, a full navigate -> fill form -> submit -> follow-redirect flow committed end to end and echoed back our submitted fields. The CDP page events left no doubt the redirect actually committed:

  ```
  Page.frameStartedNavigating   https://httpbin.org/forms/post
  Page.frameScheduledNavigation https://httpbin.org/post
  Page.frameNavigated           https://httpbin.org/post   ← redirect followed
  ```

The reason is a deep mismatch in how the two sides model a browser. As [Lightpanda's own notes call out](https://github.com/lightpanda-io/agent-skill/blob/b5e4ee6394ada2d7d53bd129c7cdfdd76082fff6/SKILL.md#important-notes), `lightpanda serve` gives **each CDP connection its own isolated, ephemeral browser context**: one connection, one context, one page. Concretely:

- page targets are never advertised on the `/json` discovery endpoint (it stays empty),
- they are invisible to any other connection, and
- they cease to exist the moment the connection that created them closes.

A focused test made each of these unambiguous. We created a target on connection #1, then went looking for it from everywhere else:

```
conn#1 creates target FID-0000000001
/json while conn#1 is OPEN      -> []      (never advertised over HTTP)
conn#2 getTargets               -> []      (invisible to other connections)
/json after conn#1 CLOSED       -> []      (gone with the connection)
conn#2 attach to that target    -> error: BrowserContextNotLoaded
```

OpenClaw's browser tool is built around Chrome's model instead: **shared, persistent, discoverable** targets. It expects to list tabs, attach to one, and keep driving it across many independent tool calls: `open`, then `snapshot`, then `type`, then `click`, then `submit`. Each of those is effectively a fresh interaction. With Lightpanda, the tool opens a page on one connection and then, on the next call, there is nothing to discover and nothing to re-attach to. The result is an endless loop of connect, fail to find the tab, disconnect.

The session trajectory shows it at its most absurd: the tool opens a page, Lightpanda mints target `FID-0000000001`, and the very next call to snapshot *that exact id* comes back empty.

```
open     -> Lightpanda mints target FID-0000000001
snapshot targetId="FID-0000000001"  -> tab not found   <- its own freshly-minted id
```

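A single long-lived connection with a single page logs in perfectly. What breaks is OpenClaw's habit of rediscovering and re-attaching to a tab on every tool call, which finds nothing when targets are scoped to the connection that created them.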
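To make the mismatch concrete, here is a toy model in Python. It talks to no real browser: `SharedTargetBrowser`, `ConnectionScopedBrowser`, and `open_then_snapshot` are hypothetical names invented for this sketch, and treating each tool call as a fresh connection is our simplification of the behavior described above, not a trace of OpenClaw's actual code.

```python
import itertools


class SharedTargetBrowser:
    """Chrome-like model: targets belong to the browser and outlive connections."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.targets = {}  # target id -> url, visible to every connection

    def connect(self):
        return SharedConnection(self)


class SharedConnection:
    def __init__(self, browser):
        self.browser = browser

    def create_target(self, url):
        target_id = f"TAB-{next(self.browser._ids)}"
        self.browser.targets[target_id] = url
        return target_id

    def get_targets(self):
        return sorted(self.browser.targets)

    def close(self):
        pass  # nothing to clean up: the targets stay in the browser


class ConnectionScopedBrowser:
    """Model of the behavior described above: each connection has a private context."""

    def __init__(self):
        self._ids = itertools.count(1)

    def connect(self):
        return ScopedConnection(self)


class ScopedConnection:
    def __init__(self, browser):
        self.browser = browser
        self.targets = {}  # private to this connection

    def create_target(self, url):
        target_id = f"FID-{next(self.browser._ids):010d}"
        self.targets[target_id] = url
        return target_id

    def get_targets(self):
        return sorted(self.targets)

    def close(self):
        self.targets.clear()  # targets vanish with the connection


def open_then_snapshot(browser, url):
    """Mimic two independent tool calls, each made on a fresh connection."""
    first = browser.connect()
    target_id = first.create_target(url)
    first.close()

    second = browser.connect()
    found = target_id in second.get_targets()
    second.close()
    return target_id, found
```

Calling `open_then_snapshot` with a `SharedTargetBrowser` finds the tab again on the second call; with a `ConnectionScopedBrowser` it does not, which is the "tab not found" loop in miniature.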

### What about letting OpenClaw manage Lightpanda itself?

The natural follow-up is: instead of attaching to a shared `serve`, what if OpenClaw launched and owned each Lightpanda instance? It doesn't help, for two reasons. First, OpenClaw's managed launcher emits pure Chrome flags (`--remote-debugging-port`, `--user-data-dir`, `--headless=new`, and friends) and bootstraps a Chrome-style user-data directory, none of which Lightpanda understands. Second, the connection-scoped, non-discoverable behavior is intrinsic to `lightpanda serve`; who starts the process changes nothing about it.

The only way Lightpanda fits here is if OpenClaw grows a **Lightpanda-native transport**: one persistent connection per browser session, tracking its own target ids, never relying on `/json` discovery. Our single-connection test proves logins work that way, but that's upstream OpenClaw work, not a config toggle on our side.
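A rough sketch of what such a transport could look like, in Python. This is not OpenClaw code and not a proposed API: `SessionTransport`, the `connect` factory, and `FakeConnection` are hypothetical names for illustration. The CDP method names (`Target.createTarget`, `Target.attachToTarget`) are real, but whether Lightpanda expects exactly this sequence is an assumption we have not verified here.

```python
class SessionTransport:
    """One persistent connection per browser session, with locally tracked targets."""

    def __init__(self, connect):
        # `connect` is a hypothetical factory returning an object with
        # send(method, params, session_id=None) -> dict and close().
        self._connect = connect
        self._conn = None
        self._sessions = {}  # target id -> CDP session id, our own bookkeeping

    def _connection(self):
        # Open lazily, then reuse the same connection for every later call.
        if self._conn is None:
            self._conn = self._connect()
        return self._conn

    def open(self, url):
        conn = self._connection()
        target_id = conn.send("Target.createTarget", {"url": url})["targetId"]
        attached = conn.send(
            "Target.attachToTarget", {"targetId": target_id, "flatten": True}
        )
        self._sessions[target_id] = attached["sessionId"]
        return target_id

    def call(self, target_id, method, params=None):
        # Resolve the target from local state: no /json lookup, no re-attach.
        session_id = self._sessions[target_id]
        return self._connection().send(method, params or {}, session_id=session_id)

    def close(self):
        # Closing the connection also ends the targets it owned.
        if self._conn is not None:
            self._conn.close()
        self._conn = None
        self._sessions.clear()


class FakeConnection:
    """Stand-in for a CDP WebSocket so the sketch runs without a browser."""

    def __init__(self):
        self.sent = []
        self.closed = False

    def send(self, method, params, session_id=None):
        self.sent.append((method, session_id))
        if method == "Target.createTarget":
            return {"targetId": "FID-0000000001"}
        if method == "Target.attachToTarget":
            return {"sessionId": "SID-1"}
        return {}

    def close(self):
        self.closed = True
```

The design choice that matters is that `open`, `snapshot`, `type`, and `click` would all go through the same `SessionTransport` instance, so the target id minted by `open` is still resolvable on the next call.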

### What about leaving browsing to a skill?

There's one more option that sidesteps the transport mismatch entirely: don't route Lightpanda through OpenClaw's browser tool at all, and instead hand the whole browsing activity to a [skill](https://github.com/lightpanda-io/agent-skill). A skill drives Lightpanda on its own terms, with its own persistent connection and its own target tracking, so the connection-scoped model that broke the tool integration is no longer a problem.

The catch is that this introduces *two* ways to browse, and which one wins depends on context. As long as the skill is loaded in context, the assistant will prefer it: the skill's description is what tells the model "use this for browsing." But if the skill isn't in context, there's nothing steering the assistant toward it, and it can fall back to the original browser tool, straight back into the connection-scoped failure mode we just described. So a skill is a real option, but it trades a transport problem for a routing one: you have to keep the skill reliably in context to keep browsing on the path that actually works.

## What's next?

For now, the pragmatic path is a managed headless Chromium under OpenClaw's normal model: the browser OpenClaw is actually built to drive, launched and owned by OpenClaw itself. That brings back the original problem that sent us to Lightpanda in the first place, the cold-start timeout on a small VM, so the plan is to address it head-on by pre-warming the browser and giving the first browse more room before it times out.
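As a minimal sketch of the pre-warming idea, assuming the browser exposes Chrome's DevTools HTTP endpoint at `/json/version`: launch the browser early, then poll until it answers before the assistant's first browse. `cdp_is_up` and `wait_until_ready` are hypothetical helper names, the base URL and time budget are placeholders, and none of this reflects an actual OpenClaw setting.

```python
import time
import urllib.request


def cdp_is_up(base_url, timeout_s=1.0):
    """Return True once the DevTools HTTP endpoint answers with 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/json/version", timeout=timeout_s) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and urllib's URLError/HTTPError.
        return False


def wait_until_ready(probe, budget_s, interval_s=0.5, clock=time.monotonic, sleep=time.sleep):
    """Poll `probe` until it succeeds or the time budget runs out.

    Returns the number of attempts used, or None if the budget expired.
    """
    deadline = clock() + budget_s
    attempts = 0
    while True:
        attempts += 1
        if probe():
            return attempts
        if clock() >= deadline:
            return None
        sleep(interval_s)
```

At VM boot this might be called as `wait_until_ready(lambda: cdp_is_up("http://127.0.0.1:9222"), budget_s=60)`, where the port and the 60-second budget are assumptions, so the cold start is paid before the first tool call rather than inside it.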

We're keeping a close eye on Lightpanda. Its footprint is genuinely compelling for the kind of isolated, resource-aware environment HolaClaw runs in, and the moment a Lightpanda-native CDP transport lands upstream, it goes right back on the table.

A few lessons we're taking with us:

- **Reproduce at the lowest useful layer.** Driving raw CDP directly is what separated "Lightpanda can't submit forms" (the issue we hit through MCP) from "OpenClaw can't keep a tab" (the real issue when OpenClaw's browser model is paired with Lightpanda).
- **A compatible protocol is not a compatible model.** Lightpanda speaks CDP, but the object model behind it, connection-scoped versus shared targets, is where the integration actually lived or died.