OpenClaw Lobster YAML Workflows: Build Pipelines That Don’t Break

Lobster turns multi-step automations into single tool calls. Learn the YAML syntax, approval gates, and resume tokens that make your workflows actually work.

9 min read · Intermediate

You ask your AI agent to fetch data, clean it, generate a report, and email it. The agent makes four separate tool calls. Halfway through, the LLM decides to “optimize” the process and skips the cleaning step. Your report is garbage.

Same task, different approach: Hand the agent a single workflow file. All four steps run in order. It pauses before sending the email so you can review. When you approve, it resumes exactly where it left off. No improvisation. No token waste re-planning each step.

That’s Lobster. OpenClaw’s workflow engine turns multi-step sequences into typed pipelines – YAML files that execute deterministically, pause for human approval, and resume with a token. Prompt-orchestrated workflows fail halfway through production runs. Lobster workflows finish.

Why LLM orchestration breaks in production

The LLM decides the control flow when you describe a multi-step task in a prompt. It calls tool A, reads the output, chooses whether to call tool B or skip to C, maybe retries on error, maybe doesn’t. Every step costs tokens. The agent re-evaluates the plan each time.

Complex workflows can burn hundreds of tokens per step just on orchestration logic (per OpenClaw’s docs). More important: LLMs are unreliable routers. A community developer who built a code review pipeline noted, “Every time I tried to put flow control in a prompt (‘when you’re done, send to the reviewer’), I introduced a failure mode.”

Lobster moves orchestration out of the LLM and into a runtime. Steps execute sequentially. Data flows as JSON between them. The LLM does creative work (writing code, analyzing text); the workflow handles plumbing (sequencing, counting, routing, retrying). GitHub Actions uses this pattern for CI/CD. Now it’s for AI agents.

The YAML syntax that actually matters

A Lobster workflow is a .lobster file (YAML or JSON) built around three top-level fields – name, args, and steps – plus optional approval and condition gates on individual steps. This structure covers 80% of real use cases:

name: data-report
args:
  source:
    default: "prod-db"
steps:
  - id: fetch
    run: data-fetch --source ${source} --json

  - id: clean
    run: data-clean --json
    stdin: $fetch.stdout

  - id: generate
    run: report-gen --template weekly
    stdin: $clean.json

  - id: review
    approval: "Send this report to the team?"
    stdin: $generate.stdout

  - id: send
    run: emailer send --to [email protected]
    stdin: $generate.stdout
    condition: $review.approved

Each step pairs an id with a run command (shell execution). stdin pipes output from a previous step: $fetch.stdout for raw text, $fetch.json for parsed JSON. approval pauses execution and returns a resume token. condition skips a step based on prior results.

Workflows execute inside the OpenClaw gateway process using an embedded runner (as of January 2026, per the Lobster repository). No subprocess is spawned. Errors appear in gateway logs, not stderr. This is recent – older community examples reference a standalone CLI with a different error surface.

Think of YAML as the assembly language for your agent’s routines. You’re not telling the LLM how to orchestrate – you’re compiling the orchestration into a typed artifact.

Arguments: ${arg} vs environment variables

Pass arguments to workflows via --args-json. Inside the YAML, reference them with ${arg} for simple substitution or LOBSTER_ARG_<NAME> environment variables for anything complex.

The catch: ${arg} is a raw string replace. Arguments containing quotes, dollar signs, backticks, or newlines break the shell command. The official docs bury this: “For anything that may contain quotes, $, backticks, or newlines, prefer env vars.”

Safe pattern:

args:
  message:
    default: "Hello, world"
steps:
  - id: send
    run: |
      echo "Message: $LOBSTER_ARG_MESSAGE"

The full args object is also available as LOBSTER_ARGS_JSON if you need to parse it yourself.
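If a step needs structured access to several arguments at once, one sketch is to parse LOBSTER_ARGS_JSON inline – assuming jq is available on the host, which the docs don't promise:

```yaml
args:
  message:
    default: "Hello, world"
steps:
  - id: inspect-args
    run: |
      # Parse the whole args object instead of substituting ${...};
      # .message here is just this workflow's example arg name.
      printf '%s\n' "$LOBSTER_ARGS_JSON" | jq -r '.message'
```

This sidesteps the ${arg} escaping problem entirely, since nothing is spliced into the command string.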

Approval gates and resume tokens

approval halts execution and returns a JSON envelope with "status": "needs_approval" and a resumeToken. You inspect the prompt, decide, and resume:

{
  "action": "resume",
  "token": "<resumeToken>",
  "approve": true
}

Lobster stores resume state under its state directory and hands back a compact token key (per docs). Move or delete that state dir between pause and resume? Token becomes invalid. The error message won’t tell you why – just “invalid token.” This is a cross-reference gap: the token mechanism is documented, but the state dir requirement isn’t mentioned in the resume section.

Pro tip: Use approve --preview-from-stdin --limit N to attach a JSON preview to approval requests without writing custom jq glue. Shows you the first N items from the piped data so you can decide without inspecting raw output.
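For reference, the paused envelope you're inspecting looks roughly like this. The status and resumeToken fields are documented; the prompt field and the exact overall shape are my assumption:

```json
{
  "status": "needs_approval",
  "prompt": "Send this report to the team?",
  "resumeToken": "<resumeToken>"
}
```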

Production gotchas nobody warns you about

| Issue | Default | What breaks | Fix |
| --- | --- | --- | --- |
| Timeout | 20000ms (20s) | Long pipelines silently fail halfway through | Set timeoutMs: 60000 or higher in the tool call |
| Stdout limit | 512000 bytes | Large JSON outputs get killed mid-stream | Raise maxStdoutBytes or paginate data |
| Invalid JSON | Pipeline prints debug text | Parser chokes on non-JSON stdout | Run in tool mode (--mode tool) and print only JSON |
| ${arg} escaping | Raw string replace | Arguments with special chars break shell | Use LOBSTER_ARG_<NAME> env vars |

The default timeout is 20 seconds. The docs list it in the error reference (“lobster timed out → increase timeoutMs”) but not in the quickstart. If your workflow does anything slow – API calls, file processing, database queries – you’ll hit the 20-second ceiling and get a cryptic failure.
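A sketch of the fix, assuming the tool call accepts a top-level timeoutMs field alongside the workflow path as the error reference implies:

```json
{
  "action": "run",
  "file": "~/workflows/data-report.lobster",
  "timeoutMs": 120000
}
```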

Hmm. You’d think a workflow engine would default to 60+ seconds for multi-step operations. But here we are.

Multi-agent pipelines with loops

Sub-workflows with loop support didn’t exist until March 2026. A developer contributing to OpenClaw documented building a code → review → test pipeline where one agent writes code, another reviews it, and a third tests it. The loop checks $LOBSTER_LOOP_JSON.approved – if false and iteration < 3, it goes back to step 2.

129 lines of code + 186 lines of tests. Now part of Lobster. Many tutorials predate this feature. If you need iteration, you’re working with recent functionality.

name: code-review-loop
steps:
  - id: code
    run: agent-send --agent programmer --task "${task}"

  - id: review
    run: agent-send --agent reviewer --code "$code.stdout"

  - id: parse-review
    run: openclaw.invoke --tool llm-task --action json --args-json '{"prompt":"Parse review","schema":{"approved":"boolean"}}'
    stdin: $review.stdout

  - id: loop-check
    condition: $parse-review.json.approved == false && $LOBSTER_LOOP_COUNT < 3
    run: echo "Retry"
    loop: true

Deterministic multi-agent coordination. LLMs do creative work. YAML handles control flow.

Session keys as data model

The same developer noted a useful pattern: session keys like pipeline:<project>:<role> give you project isolation, role separation, and addressability in one string. No database needed – the session key is the address. OpenClaw already has agentToAgent, sessions_send, and webhooks with session routing. Combine those with Lobster and you have a message bus.
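As a sketch of what that addressing looks like inside a workflow step – the sessions_send flags here are assumptions for illustration, not documented syntax:

```yaml
steps:
  - id: hand-off
    # pipeline:<project>:<role> -- the session key is the address
    run: sessions_send --session "pipeline:acme-site:reviewer" --message "Code ready for review"
```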

When to use Lobster vs prompt orchestration

Prompt orchestration still wins for exploratory tasks where you don’t know the steps ahead of time. Lobster wins when:

  • You’ve run the workflow manually 3+ times and the steps are stable
  • Failures mid-execution are expensive (data loss, partial state, duplicate actions)
  • You need human approval before side effects (sending emails, posting to APIs, deleting data)
  • Token cost matters – you’re running the workflow frequently or at scale
  • You need audit logs (workflows are data; you can diff, version, and replay them)

The ClawFlows announcement in February 2026 introduced natural language pipeline generation: you describe the workflow in plain English, an agent writes the YAML, creates a pull request, and you review before deploying. Closes the gap between “I don’t know YAML” and “I need a deterministic pipeline.”

Enable Lobster and run your first workflow

Lobster is an optional plugin, disabled by default. Add it to your ~/.openclaw/openclaw.json:

{
  "plugins": {
    "entries": {
      "lobster": {
        "enabled": true
      }
    }
  },
  "agents": {
    "list": [
      {
        "id": "main",
        "tools": {
          "allow": ["lobster"]
        }
      }
    ]
  }
}

Save a workflow to ~/workflows/test.lobster:

name: hello
steps:
  - id: greet
    run: echo '{"message": "Hello from Lobster"}'

Call it from your agent:

"Run the workflow at ~/workflows/test.lobster"

OpenClaw executes it in-process and returns the JSON envelope. "status": "ok"? It worked.
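The success envelope looks something like this – only the status field is confirmed by the docs; the rest of the shape is an assumption for illustration:

```json
{
  "status": "ok",
  "output": { "message": "Hello from Lobster" }
}
```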

What to build next

Start with a workflow you’re already doing manually. Common first projects:

  • Daily standup report: Fetch GitHub activity, format it, post to Slack – with approval before posting
  • Inbox triage: List emails, categorize them with an LLM, apply labels – pause before applying
  • Data sync: Pull from API A, transform, push to API B – with validation step in between
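As a sketch, the standup report from the first bullet maps directly onto the syntax covered earlier. The gh-activity, report-fmt, and slack-post commands are placeholders for whatever JSON-emitting CLIs you actually have:

```yaml
name: daily-standup
steps:
  - id: fetch
    run: gh-activity --since yesterday --json

  - id: format
    run: report-fmt --style standup
    stdin: $fetch.json

  - id: review
    approval: "Post this standup to Slack?"
    stdin: $format.stdout

  - id: post
    run: slack-post --channel "#standup"
    stdin: $format.stdout
    condition: $review.approved
```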

The public “second brain” example chains commands like inbox list --json, inbox categorize --json, and inbox apply --json into workflows with approval gates. CLI emits JSON, Lobster pipes it, approval gate stops it before side effects. That’s the pattern.

FAQ

Can I call other OpenClaw tools from a Lobster workflow?

Yes. openclaw.invoke --tool <name> --action <action> --args-json '{...}' in a step’s run command. Done.

What happens if a step in the middle fails?

Workflow halts and returns "status": "failed" with error details. No resume token – failed workflows stop, they don’t pause. If you need retry logic, wrap the failing command in a script that retries N times and exits 0 on success. Or use a condition field to skip subsequent steps when a prior step’s output indicates failure. One developer I talked to handles this with a wrapper script that does exponential backoff for API calls – 3 retries, then fail. Lobster doesn’t have built-in retry primitives; you handle that in the steps themselves.
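A minimal sketch of that wrapper, as plain POSIX shell – Lobster has no built-in retry primitive, so a step would call something like retry curl -fsS https://api.example.com/data (the URL is a placeholder):

```shell
# retry: run a command up to $max times with exponential backoff,
# returning 0 on the first success. Illustrative wrapper only.
retry() {
  max=3
  delay=1
  n=1
  until "$@"; do
    if [ "$n" -ge "$max" ]; then
      echo "retry: failed after $max attempts: $*" >&2
      return 1
    fi
    sleep "$delay"          # back off before the next attempt
    delay=$((delay * 2))    # 1s, 2s, 4s, ...
    n=$((n + 1))
  done
  return 0
}
```

Because the wrapper exits 0 on success, Lobster sees a clean step either way: success, or a single failure after the retries are exhausted.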

How do I debug a workflow that returns invalid JSON?

Run the same pipeline in a terminal to inspect stderr: lobster run --file workflow.lobster (if you have the standalone CLI) or check the OpenClaw gateway logs if you’re using the embedded runner. Most common cause: a step prints debug text to stdout instead of just JSON. Make sure every step that pipes to the next one outputs valid JSON and nothing else. Use --mode tool when calling the workflow to enforce JSON-only output, or redirect debug logs to stderr in your scripts. I’ve lost an hour to a stray “Processing…” print statement more times than I’d like to admit.
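The discipline that prevents this is simple: route every human-readable message to stderr and reserve stdout for JSON. A sketch of a step script following that rule (the log helper is my own convention, not a Lobster built-in):

```shell
# Keep stdout pure JSON: all human-readable logging goes to stderr,
# so the next step's JSON parser never sees stray text.
log() { printf '%s\n' "$*" >&2; }   # debug channel: stderr only

log "Processing 3 records..."       # safe: invisible to the pipe
printf '{"processed": 3}\n'         # the only thing on stdout
```

Gateway logs still capture the stderr lines, so you lose nothing for debugging.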