April’s feed had one loud pattern: the model stopped being the whole product.
The work moved outward into harnesses, skills, memory, context routing, browser control, eval loops, managed execution, and boring scripts that keep deterministic work out of frontier-model calls.
OpenAI Agents SDK, Codex, Claude Code routines, Gemini CLI subagents, Pi, Hermes, GBrain, browser-harness, Meta-Harness, HyperFrames, Firecrawl, Agent CI, Passmark, and SKILL.md workflows all pointed at the same thing: the wrapper became architecture.
The thesis
- The model is one component. The product is weights plus prompts plus tools plus memory plus compaction plus permissions plus evals plus provider quirks.
- Memory is not “save markdown somewhere.” It is deciding what enters context, when, and with what freshness and relationship graph attached.
- Skills are workflow packaging. A good skill is a solved path compressed into reusable instructions, scripts, and tool boundaries.
- HTML is becoming an agent-native artifact format. Reports, slides, implementation notes, videos, and UI previews should be inspectable by both humans and agents.
- The money is less “build another AI app” and more “install the work graph”: migrations, review, annotation, maintenance, GTM ops, internal agents, domain-specific workflows.
1. The harness became the product
Harness beats model.
Two tools can call similar models and feel completely different. The system is model plus context, tools, prompts, skills, memory, caching, compaction, permissions, and provider behavior. “Model version” is fake precision when the surrounding stack keeps changing.
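One way to make that concrete is to fingerprint the whole stack rather than the model name alone. The sketch below is illustrative only: `HarnessConfig`, its fields, and `fingerprint` are hypothetical names, not part of any SDK mentioned here.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class HarnessConfig:
    """Hypothetical record of everything that shapes agent behavior."""
    model: str
    system_prompt: str
    tools: Tuple[str, ...] = ()
    skills: Tuple[str, ...] = ()
    compaction_policy: str = "none"
    permissions: Tuple[str, ...] = ()


def fingerprint(cfg: HarnessConfig) -> str:
    """Hash the full config so any change to the stack changes the ID."""
    payload = json.dumps(asdict(cfg), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


# Same model string, different surrounding stack (values are made up).
a = HarnessConfig(model="example-model-1", system_prompt="Be terse.", tools=("shell",))
b = HarnessConfig(model="example-model-1", system_prompt="Be terse.", tools=("shell", "browser"))

print(a.model == b.model)                # True
print(fingerprint(a) == fingerprint(b))  # False
```

Logging a fingerprint like this next to every trace makes it visible when two runs on the "same model" were actually produced by different systems.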
Signals:
- OpenAI Agents SDK made agents, handoffs, tracing, long-running agents, sandboxes, and memory explicit.
- Meta-Harness framed harness improvement as an optimization loop over code, traces, and scores.
- browser-harness treated Chrome automation as a self-healing CDP harness instead of a brittle browser script.
- Agent CI turned local GitHub Actions into a loop agents can execute against.
- Passmark pushed natural-language regression tests, model assertions, caching, and telemetry around Playwright.
Evals are the new training data. You update the harness from traces, run evals, tune prompts/tools/context, validate, repeat. “Agentic coding is ML” landed: generated code is a black-box artifact, specs and tests are the objective, evals are the validation set, the harness is the search process.
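The loop reads roughly like model selection in ML. Here is a minimal sketch under heavy assumptions: the "agent" is a toy text normalizer, and `build_agent`, `score`, and `search_harness` are hypothetical names. A real harness would vary prompts, tools, and context rules instead of two boolean flags.

```python
from typing import Callable, Dict, List, Tuple

EvalCase = Tuple[str, str]  # (input, expected output)
Agent = Callable[[str], str]
Config = Dict[str, bool]


def build_agent(cfg: Config) -> Agent:
    """Toy stand-in for an agent: the config decides how input is handled."""
    def run(text: str) -> str:
        if cfg["strip"]:
            text = text.strip()
        if cfg["lower"]:
            text = text.lower()
        return text
    return run


def score(agent: Agent, cases: List[EvalCase]) -> float:
    """Fraction of eval cases the agent passes."""
    passed = sum(1 for given, expected in cases if agent(given) == expected)
    return passed / len(cases)


def search_harness(candidates: List[Config],
                   dev_cases: List[EvalCase],
                   holdout_cases: List[EvalCase]):
    """Pick the best config on dev evals, then report its holdout score."""
    best = max(candidates, key=lambda cfg: score(build_agent(cfg), dev_cases))
    agent = build_agent(best)
    return best, score(agent, dev_cases), score(agent, holdout_cases)


candidates = [
    {"strip": False, "lower": False},
    {"strip": True, "lower": False},
    {"strip": True, "lower": True},
]
dev_cases = [("  Hello ", "hello"), ("WORLD", "world")]
holdout_cases = [(" Foo", "foo")]  # never used for selection

best, dev_score, holdout_score = search_harness(candidates, dev_cases, holdout_cases)
print(best, dev_score, holdout_score)
```

Keeping the holdout cases out of the selection step is the point of the analogy: a harness tuned against its own evals can overfit them just as a model can.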
2. Managed agents got real
Claude Managed Agents: brain/hands/session split, credentials outside the sandbox, durable event log, disposable Linux containers, OpenTelemetry, $0.08/session-hour, reported median time-to-first-byte drop of 60%.
Codex moved toward a universal dev app: browser use, computer use, multi-terminal, SSH/devboxes, docs and PDFs, memory, plugins, automations. Chronicle-style memory supplies recent screen context, so you do not have to re-explain what you were doing.
Claude Code routines, Codex automations, pinned threads, scheduled work, and heartbeats all converged on the same point: persistent agents need persistent state and a work surface, not just a chat transcript.
Shopify AI Toolkit and Cloudflare’s Agent Lee made enterprise concrete. Agents are getting write access to products, orders, inventory, SEO, images, Workers, R2, DNS, and error summaries. Protocol and permission layers matter more than UI.
3. Memory is context routing
- markdown is a good interface, not the whole memory system
- compaction is the memory write path
- graph is the final boss
- file search breaks when the agent does not know it should search
- proactive injection is the hard part
GBrain, LLMwiki, Rowboat, Obsidian AI tools, and Hermes memory workflows attacked the same problem: turn raw tweets, chats, notes, decisions, and project history into recall that enters context at the right time.
Memory is not storage. Memory is deciding what enters context when.
Markdown remains the right human-facing substrate: git-backed, diffable, greppable, exportable. The runtime becomes graph/RAG with temporal validity, people/project edges, explicit forgetting.
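A minimal sketch of what "deciding what enters context when" could look like, assuming each memory carries entity edges, an optional expiry, and a forgotten flag. `MemoryItem`, `route`, the sample notes, and the dates are all invented for illustration and do not describe how GBrain, Hermes, or any other tool named here works.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, List, Optional


@dataclass
class MemoryItem:
    text: str
    entities: FrozenSet[str]               # people/project edges
    recorded_at: datetime
    valid_until: Optional[datetime] = None  # temporal validity
    forgotten: bool = False                 # explicit forgetting


def route(items: List[MemoryItem], task_entities: FrozenSet[str],
          now: datetime, budget_chars: int = 200) -> List[MemoryItem]:
    """Select memories for the current task within a context budget."""
    live = [
        m for m in items
        if not m.forgotten
        and (m.valid_until is None or m.valid_until > now)
        and m.entities & task_entities
    ]
    # Rank by entity overlap first, then by recency.
    live.sort(key=lambda m: (len(m.entities & task_entities), m.recorded_at),
              reverse=True)
    selected, used = [], 0
    for m in live:
        if used + len(m.text) > budget_chars:
            continue  # skip items that do not fit the remaining budget
        selected.append(m)
        used += len(m.text)
    return selected


now = datetime(2025, 4, 30)
items = [
    MemoryItem("Alice owns the billing migration.",
               frozenset({"alice", "billing"}), datetime(2025, 4, 20)),
    MemoryItem("Billing cutover is scheduled for March.",
               frozenset({"billing"}), datetime(2025, 2, 1),
               valid_until=datetime(2025, 3, 31)),
    MemoryItem("Old billing vendor contact.",
               frozenset({"billing"}), datetime(2025, 1, 5), forgotten=True),
    MemoryItem("Bob prefers async standups.",
               frozenset({"bob"}), datetime(2025, 4, 25)),
    MemoryItem("Billing tests run nightly.",
               frozenset({"billing"}), datetime(2025, 4, 28)),
]
task = frozenset({"alice", "billing"})
for m in route(items, task, now):
    print(m.text)
```

The expired, forgotten, and unrelated notes never reach the context window, even though all of them would still sit in the markdown files.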
4. Tools worth opening
- Hermes Agent - open agent stack with skills, tools, memory, phone access, cost visibility.
- GBrain - personal total-recall layer around OpenClaw/Hermes workflows.
- Cabinet - open-source startup OS with agents, schedules, KB, browser terminal, local-first.
- browser-harness - self-healing browser harness for Claude Code/Codex-style work.
- HyperFrames - agent-native HTML to MP4.
- Firecrawl web-agent and Firecrawl Parse - web/PDF ingestion for agents.
- Agent CI - local GitHub Actions for agents.
- awesome-design-md - brand/design systems as markdown for agents.
- Syncthing - boring peer-to-peer file sync, relevant because durable local files beat trapped SaaS databases.
The meta-tool was SKILL.md: portable workflow packaging.
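To make the packaging idea concrete, here is a small loader sketch. It assumes a SKILL.md that opens with a front-matter block delimited by `---` and holding `name` and `description` keys. The sample skill is invented, `parse_skill` is a hypothetical helper, and the parser handles only flat `key: value` lines, not full YAML.

```python
def parse_skill(text: str) -> dict:
    """Split an assumed SKILL.md layout into metadata and instructions."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing front matter")
    meta = {}
    body_start = None
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body_start = i + 1
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    if body_start is None:
        raise ValueError("unterminated front matter")
    return {"meta": meta, "body": "\n".join(lines[body_start:]).strip()}


# Invented example skill, for illustration only.
SAMPLE = """---
name: db-index-tuning
description: Seed repro data, try candidate indexes, measure impact.
---
1. Seed a repro database.
2. Time the slow query.
"""

skill = parse_skill(SAMPLE)
print(skill["meta"]["name"])
print(skill["body"])
```

The short metadata block is what a harness could scan cheaply to decide whether a skill applies. The body only needs to enter context once the skill is selected.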
5. Workflows to steal
- Database performance loop: seed repro data, optimize query, try 10 indexes, measure impact.
- Spec-first agent work: review SPEC.md boundaries, let agents black-box behind them, review contracts instead of every line.
- Skillification loop: do the work once, turn the solved path into a skill.
- HTML implementation notes: keep implementation-notes.html while working. Decisions, tradeoffs, gaps stay readable.
- Deterministic first: a cron job plus one LLM API call replaces most “agents.” Recurring workflows as code, not token burn.
- Browser loop: agent sees state, acts in Chrome, verifies result. A work surface, not just tests.
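The database performance loop from the list above can be sketched with nothing but the standard library. This is a toy under stated assumptions: an in-memory SQLite database, a made-up `orders` table, and three candidate indexes instead of ten. All table, column, and index names are hypothetical.

```python
import sqlite3
import time

QUERY = "SELECT SUM(total) FROM orders WHERE customer_id = ?"


def seed(conn: sqlite3.Connection, rows: int = 5000) -> None:
    """Create repro data that is large enough to show a difference."""
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, "
        "status TEXT, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, status, total) VALUES (?, ?, ?)",
        ((i % 500, "open" if i % 3 else "closed", i * 0.5) for i in range(rows)),
    )
    conn.commit()


def measure(conn: sqlite3.Connection, repeats: int = 20):
    """Return (elapsed seconds, query plan text) for the target query."""
    start = time.perf_counter()
    for _ in range(repeats):
        conn.execute(QUERY, (42,)).fetchall()
    elapsed = time.perf_counter() - start
    plan_rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
    plan = " | ".join(row[3] for row in plan_rows)
    return elapsed, plan


def try_indexes(conn: sqlite3.Connection, candidates: dict) -> dict:
    """Measure a baseline, then each candidate index in isolation."""
    results = {"baseline": measure(conn)}
    for name, columns in candidates.items():
        conn.execute(f"CREATE INDEX {name} ON orders ({columns})")
        results[name] = measure(conn)
        conn.execute(f"DROP INDEX {name}")  # keep candidates independent
    return results


conn = sqlite3.connect(":memory:")
seed(conn)
results = try_indexes(conn, {
    "idx_status": "status",
    "idx_customer": "customer_id",
    "idx_customer_total": "customer_id, total",
})
for name, (elapsed, plan) in sorted(results.items(), key=lambda kv: kv[1][0]):
    print(f"{name:20s} {elapsed * 1000:8.2f} ms  {plan}")
```

Timings vary by machine, so the query plan column is the more reliable signal: it shows whether a candidate index was used at all. An agent can run this loop unattended because every step is deterministic and the result is a table it can read back.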
6. Money and distribution
HireCade: $22M ARR, 95% margins, 5 people, no funding, 14 months. Mostly annotation services, 30% service fee, large candidate database, long enterprise sales cycle.
The overlooked wedge: AI labs and AI-heavy companies need verification, review, annotation, migration, maintenance, and domain-specific human loops. “AI adoption consulting at $200+/hr” is not a meme when you actually install workflows.
Enterprise AI money is the ugly 90%: migrations, broken internal systems, compliance, support, data cleanup, job queues, search, docs, evals, permissions.
Distribution lesson: ship, share, repeat. Code plus content plus product storytelling beats separate marketing.
Uncomfortable take
Most AI adoption is not blocked by model quality. It is blocked by missing work graphs.
Companies need connected tools, permissions, memory, evals, workflow defaults, review gates, and people willing to redesign the work.
The model is powerful. The harness decides whether that power turns into work.