Blog

Writing about AI, agents, and the experience of building with them.

autoresearch, karpathy-loop, openclaw, governance, self-improving-agents, devops

AutoResearch: How We Use AI to Improve Our Own Platform

We built an autonomous experimentation pipeline that runs real experiments on our B2B dashboard — the Karpathy Loop with governance, cost tracking, and knowledge accumulation. Here's how it works and what we learned.

Apr 5, 2026 · 5 min read · AI-assisted · $2.16 AI cost

dev-team, pipeline, live-build, deploy, automation

Live Build — Ship Code to Production in 3 Minutes

Watch Claude Code write real code in real time, auto-deploy it to production, and verify it with screenshots, all from a single plain-English description. Here's how the 14-stage pipeline works and what the numbers actually look like.

Mar 31, 2026 · 5 min read · AI-assisted · $0.04 AI cost

Technical Overview · retrospective, testing, production, bugs, engineering

The Week We Stopped Building Features

74 commits, 30 fixes, 4,700 new tests. After a marathon sprint, we spent three days doing nothing but making the platform actually work.

Mar 28, 2026 · 9 min read · AI-assisted · $0.12 AI cost

Architecture Deep Dive · retrospective, shipping, platform, marathon

Marathon Session: 4 Sites, 50+ Features, One Weekend

A full accounting of what got built, broken, and fixed in a single continuous session — from per-org GitHub config to a working AI chat widget.

Mar 25, 2026 · 12 min read · AI-assisted · $0.14 AI cost

Architecture Deep Dive · security, openclaw, containers, governance

Why AI Agents Need a Locked Room

135,000 exposed instances. 63% vulnerable. AI agents are powerful — and that's exactly why they need containers, governance, and an audit trail.

Mar 25, 2026 · 9 min read · AI-assisted · $0.28 AI cost

Architecture Deep Dive · dev-team, pipeline, claude-code, automation, deploy, ci-cd, docker

From GitHub Issue to Production Deploy in 3 Minutes — Building an AI Dev Team Pipeline

How we built a fully automated pipeline where a GitHub issue triggers AI decomposition, Docker containers running Claude Code Opus write the code, automated review catches bugs, and a configurable deploy pipeline pushes to production. Real numbers: PR #1235 in 179 seconds, $0.22 total cost.

Mar 24, 2026 · 9 min read · AI-assisted · $0.22 AI cost

Architecture Deep Dive · retrospective, dev-team, multi-repo, pipeline, claude-code, automation, marginmandy, lessons-learned

Week in Review: The AI Dev Team Goes Multi-Repo

An honest engineering retrospective covering March 19-24, 2026 — 167 commits across 3 repos, 14 merged PRs on a brand-new project, per-org GitHub config, a configurable deploy pipeline, 12 pluggable worker engines, and every bug that bit us along the way. LLM cost: $0.

Mar 24, 2026 · 13 min read · AI-assisted · $0.13 AI cost

Technical Overview · dogfooding, dashboard, governance, cost-tracking, devops, continuous-improvement

Dogfooding: How We Use Our Own Dashboard to Ship Better

We build a governance platform for AI agents. We also run AI agents. Here's how using our own dashboard — cost attribution, error tracking, evaluation framework — makes our development process concretely better.

Mar 19, 2026 · 7 min read · AI-assisted · $0.28 AI cost

Architecture Deep Dive · retrospective, claude-code, engineering, ai-pair-programming, lessons-learned

Sprint Retro: What Claude Code and I Actually Shipped (And What Broke)

A real engineering retrospective on 10 days of human+AI pair programming — 319 commits, 3 production bugs, 1 billing error that overcharged 100x, and what we're doing differently next sprint.

Mar 19, 2026 · 11 min read · AI-assisted · $0.34 AI cost

Technical Overview · models, cost-tracking, governance, stepfun, debugging

Why We Switched from MiniMax to StepFun (And Found a 100x Billing Bug)

A model swap that should have been a one-line change turned into a 46-file investigation, uncovered a critical billing bug, and taught us why model governance needs a single source of truth.

Mar 19, 2026 · 6 min read · AI-assisted · $0.22 AI cost

Technical Overview · gateway, architecture, governance, curate-me, openclaw

How I Built an AI Gateway for OpenClaw

A technical deep-dive on building a governance gateway that proxies 51 LLM providers, enforces cost caps, scans for PII, and tracks every token — all without changing a single line of application code.

Mar 17, 2026 · 9 min read · AI-assisted · $0.14 AI cost

For Leaders · costs, optimization, agents, curate-me, case-study

How We Cut AI Agent Costs by 85%

A cost optimization case study: swapping models, adding semantic caching, and migrating to DeepSeek R1 took our multi-agent pipeline from $0.12 per analysis to under $0.02 — with no quality loss.

Mar 17, 2026 · 8 min read · AI-assisted · $0.12 AI cost

For Leaders · security, agents, identity, governance, enterprise

Agent Identity: The Enterprise Blind Spot

Only 21.9% of organizations treat AI agents as independent identities. The rest have a compliance gap that regulators will notice.

Mar 13, 2026 · 5 min read · AI-assisted · $0.16 AI cost

Architecture Deep Dive · agents, mcp, a2a, standards, openclaw

AGENTS.md, MCP, and A2A: The Three Standards Every Agent Team Needs

The protocol landscape for AI agents has crystallized. Here's what each standard does, how they compose together, and what your team should adopt now.

Mar 13, 2026 · 6 min read · AI-assisted · $0.18 AI cost

Technical Overview · agents, architecture, multi-agent, curate-me

From Single Agent to Fleet: What Changes

Moving from one AI agent to nine required rethinking coordination, cost control, and observability. Here's what we learned.

Mar 13, 2026 · 5 min read · AI-assisted · $0.13 AI cost

For Leaders · agents, hitl, governance, patterns

Human-in-the-Loop Patterns for AI Agents

Five human approval patterns we use — from simple approve/reject to iterative feedback loops. How to keep humans in control without slowing agents down.

Mar 13, 2026 · 5 min read · AI-assisted · $0.14 AI cost

Architecture Deep Dive · agents, architecture, thin-client, perplexity

Perplexity Computer and the Thin Client Pattern

Perplexity's multi-model agent orchestrator validates the thin client approach we've been building. Here's what we learned — and what we shipped.

Mar 13, 2026 · 8 min read · AI-assisted · $0.15 AI cost

Technical Overview · observability, agents, monitoring, governance

The Agent Observability Gap

89% of organizations have implemented agent observability. Most still can't trace failures through multi-step workflows. Here's why — and what to do about it.

Mar 13, 2026 · 5 min read · AI-assisted · $0.17 AI cost

Technical Overview · agents, openclaw, patterns, ai

The SOUL.md Pattern: Giving Agents Personality

Every agent on this blog has a SOUL.md file that defines its personality, constraints, and communication style. Here's why this matters more than you'd think.

Mar 13, 2026 · 4 min read · AI-assisted · $0.09 AI cost

For Leaders · costs, agents, governance, curate-me

Cost Tracking Across an Agent Fleet

Running 10 AI agents daily gets expensive fast. Here's how we track costs per agent and per model — and how budget caps prevent runaway spending.

Mar 12, 2026 · 4 min read · AI-assisted · $0.11 AI cost

Technical Overview · ai, refinement, agents, quality

Iterative Refinement: How AI Reviews AI

How the blog uses a quality loop where one model writes and another reviews — with configurable thresholds, max iterations, and score tracking.

Mar 12, 2026 · 4 min read · AI-assisted · $0.22 AI cost

Architecture Deep Dive · architecture, agents, webhooks, patterns

Webhook-Driven Agent Architecture

Why we chose webhooks over WebSockets or polling for agent communication — and how a single endpoint handles 15 different event types.

Mar 12, 2026 · 5 min read · AI-assisted · $0.10 AI cost

Technical Overview · governance, ai, agents, curate-me

Agent Governance: Why It Matters

AI agents are powerful — but without governance, they're unpredictable, expensive, and impossible to audit. Here's how curate-me.ai solves this.

Mar 11, 2026 · 3 min read · AI-assisted · $0.15 AI cost

Architecture Deep Dive · agents, openclaw, architecture, curate-me

How the Agent Fleet Actually Works

A deep dive into the 9 OpenClaw agents that run this blog — their profiles, schedules, tools, and how they coordinate through the curate-me.ai gateway.

Mar 11, 2026 · 4 min read · AI-assisted · $0.08 AI cost

Technical Overview · ai, agents, openclaw, curate-me

Building a Blog With AI Agents

How I rebuilt my blog from scratch using OpenClaw agents governed by curate-me.ai — and why the blog itself is the best reference app for the platform.

Mar 10, 2026 · 3 min read · AI-assisted · $0.19 AI cost