What's new in Flapjack.
Claude Opus 4.7 available
Anthropic released Opus 4.7 today. It's now selectable as `claude-opus-4-7` alongside Opus 4.6, which remains supported.
Tool kind is now editable
You can now change a tool's kind (webhook, custom, delegated) after creation via the API. No more asking us to fix it in the database.
Full tool CRUD in the SDK
listTools, getTool, createTool, updateTool, deleteTool: manage your custom tools programmatically. Available in @maats/flapjack 0.3.2.
Self-service domain allowlisting
Custom tools and delegated configs no longer require us to update a global allowlist. Manage your own allowed domains in Settings — add a domain once and all your agents can connect to it.
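For a sense of what an allowlist check involves, here is a minimal sketch. The function name and the matching rule (exact hostname or any subdomain of an allowed entry) are assumptions for illustration; the note above does not specify how Flapjack matches domains.

```typescript
// Hypothetical helper: returns true when the URL's host is an allowed domain
// or a subdomain of one. The subdomain rule is an assumption, not documented behavior.
function isDomainAllowed(url: string, allowedDomains: string[]): boolean {
  let host: string;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return false; // not a parseable absolute URL
  }
  return allowedDomains.some((entry) => {
    const domain = entry.toLowerCase();
    // Require a dot boundary so "evilexample.com" does not match "example.com".
    return host === domain || host.endsWith("." + domain);
  });
}
```

Matching on the parsed hostname rather than on the raw string avoids false positives from paths or query strings that happen to contain an allowed domain.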
Empty responses are no longer a mystery
If the LLM returns nothing — no text, no tool calls — we retry once automatically. Still empty? You get an EMPTY_RESPONSE error with diagnostics instead of silence.
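The retry-once behavior described above can be sketched roughly as follows. The response shape, the error class, and the function names are hypothetical; only the EMPTY_RESPONSE code comes from the note itself.

```typescript
// Hypothetical response shape; field names are assumptions.
interface LlmResponse {
  text: string;
  toolCalls: unknown[];
}

class EmptyResponseError extends Error {
  readonly code = "EMPTY_RESPONSE";
  readonly attempts: number;
  constructor(attempts: number) {
    super(`LLM returned no text and no tool calls after ${attempts} attempts`);
    this.attempts = attempts;
  }
}

// A response is empty when it has neither text nor tool calls.
function isEmpty(response: LlmResponse): boolean {
  return response.text.trim() === "" && response.toolCalls.length === 0;
}

// Call the model, retry once on an empty response, then fail with a clear code.
async function callWithEmptyRetry(
  call: () => Promise<LlmResponse>,
): Promise<LlmResponse> {
  const maxAttempts = 2; // the original call plus one automatic retry
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await call();
    if (!isEmpty(response)) return response;
  }
  throw new EmptyResponseError(maxAttempts);
}
```

On the client side, the practical consequence is that an error with code EMPTY_RESPONSE means the retry has already happened, so retrying again immediately is unlikely to help.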
First message actually reaches the LLM
Messages with special characters (like box-drawing chars) were silently dropped on the first turn of a thread. Fixed a serialization issue in how we pass content to the runtime.
Sandboxes that don't vanish mid-build
Ephemeral compute environments now use named sandboxes with no auto-timeout. Your agent's cloned repos survive between tool rounds. If a sandbox does die, the agent gets told to re-clone instead of silently failing.
Choose your sandbox provider
Agent compute now supports two providers: Tensorlake (default, runs alongside the agent) and Vercel Sandbox (Firecracker microVMs with reliable filesystem persistence). Pick per-agent in settings.
Cost estimates in the done event
The SSE done event now includes estimated_cost_usd in the usage object. Stop duplicating our pricing table.
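If you consume the stream by hand, reading the new field might look like the sketch below. Only `usage.estimated_cost_usd` is taken from the note above; the rest of the payload shape and the function name are assumptions.

```typescript
// Assumed payload shape: only usage.estimated_cost_usd is named in the changelog.
interface DoneEventPayload {
  usage?: { estimated_cost_usd?: number };
}

// Extract the estimated cost from the raw data payload of a done event.
// Returns undefined when the payload is not valid JSON or the field is absent.
function estimatedCostUsd(rawData: string): number | undefined {
  let parsed: DoneEventPayload | null;
  try {
    parsed = JSON.parse(rawData) as DoneEventPayload | null;
  } catch {
    return undefined;
  }
  const cost = parsed?.usage?.estimated_cost_usd;
  return typeof cost === "number" && Number.isFinite(cost) ? cost : undefined;
}
```

Treating the field as optional keeps older clients and replayed events from crashing when the value is missing.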
Model pricing corrected
Every model's pricing was wrong. Opus 4.6 was listed at 3x its actual rate, and GPT-5.4 output at nearly half. All six models are now updated to current vendor rates.
@maats/flapjack SDK on npm
The official TypeScript SDK is live. React hooks, streaming, embeddable chat components, and full API coverage. npm install @maats/flapjack and ship.
Agent computer
Agents can execute code, read/write files, and run shell commands in sandboxed environments. Ephemeral mode with thread-scoped persistence.
Per-agent max tool rounds
Control how many tool-call rounds an agent can run before being cut off. Pairs well with cost limits.
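Conceptually, a round cap bounds the agent loop like this. The sketch is illustrative only: the types and function are hypothetical and say nothing about how the runtime implements the limit.

```typescript
// Hypothetical result of one model turn: it either requests tools or is finished.
interface TurnResult {
  wantsTools: boolean;
}

// Run model turns until the model stops asking for tools or the cap is reached.
function runWithRoundCap(
  maxToolRounds: number,
  turn: (roundsSoFar: number) => TurnResult,
): { rounds: number; cutOff: boolean } {
  let rounds = 0;
  while (true) {
    const result = turn(rounds);
    if (!result.wantsTools) return { rounds, cutOff: false };
    if (rounds >= maxToolRounds) return { rounds, cutOff: true };
    rounds++; // execute the requested tools, then give the model another turn
  }
}
```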
Per-tool HMAC secrets
Each webhook tool gets its own signing secret. Your endpoints can verify payloads came from Flapjack.
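A receiving endpoint could verify a payload along these lines. This is a sketch under stated assumptions: it assumes the signature is a hex-encoded HMAC-SHA256 of the raw request body, which the note above does not confirm, and it leaves out how the signature is delivered (header name and format). Check the webhook documentation for the actual scheme.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: signature = hex(HMAC-SHA256(secret, rawBody)).
function verifySignature(
  rawBody: string,
  signatureHex: string,
  secret: string,
): boolean {
  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}
```

Verify against the raw body bytes, not a re-serialized JSON object, since re-serialization can change whitespace or key order and invalidate the signature.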
Memory respects boundaries
Per-message overrides can only restrict memory, never widen it. Disabling memory at the agent level can't be overridden by a sneaky SDK call.
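The restrict-only rule amounts to taking the intersection of the agent's settings and the per-message override. A minimal sketch, with hypothetical type and field names (the scope names are borrowed from the memory notes in this changelog):

```typescript
type MemoryScope = "agent" | "thread" | "resource";

// Hypothetical config shapes; not the SDK's actual types.
interface MemoryConfig {
  enabled: boolean;
  scopes: MemoryScope[];
}
interface MemoryOverride {
  enabled?: boolean;
  scopes?: MemoryScope[];
}

// Combine agent settings with a per-message override so the result is never
// broader than what the agent allows.
function effectiveMemory(
  agent: MemoryConfig,
  override: MemoryOverride = {},
): MemoryConfig {
  // An override can turn memory off, but cannot turn it on.
  const enabled = agent.enabled && override.enabled !== false;
  if (!enabled) return { enabled: false, scopes: [] };
  const requested = override.scopes;
  // Requested scopes are intersected with the agent's scopes, never added.
  const scopes =
    requested === undefined
      ? [...agent.scopes]
      : agent.scopes.filter((scope) => requested.includes(scope));
  return { enabled, scopes };
}
```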
Memory retrieval in public chats
The shared chat engine now retrieves memories with proper resource scoping. Agent, thread, and resource-scoped memories work across both API and SDK paths.
WhatsApp bridge hardened
Dynamic protocol version fetching, retry backoff on 500s, session cleanup on disconnect. No more reconnect loops or leaked service URLs in logs.
Consensus suggestions
Multi-instance suggestion generation with consensus voting. Agents and runners can surface high-confidence prompt improvement suggestions.
Prompt skeleton templates
Model-optimized starter templates for agent preambles. Pick a skeleton, fill in the blanks, get a well-structured prompt.
Runner budgets
Set a USD budget per runner run. Hit the limit and the run pauses or fails — your choice. Per-run overrides for when you need to splurge.
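The decision at each checkpoint can be pictured as below. All names are hypothetical, and the sketch assumes a per-run override simply replaces the runner's budget for that run.

```typescript
type LimitAction = "pause" | "fail";

// Hypothetical runner budget settings.
interface RunnerBudget {
  budgetUsd: number;
  onLimit: LimitAction;
}

// Decide what happens to a run given its spend so far.
// Assumption: a per-run override replaces the runner-level budget.
function budgetDecision(
  spentUsd: number,
  runner: RunnerBudget,
  runOverrideUsd?: number,
): "continue" | LimitAction {
  const limit = runOverrideUsd ?? runner.budgetUsd;
  return spentUsd >= limit ? runner.onLimit : "continue";
}
```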
WhatsApp channel
Connect agents to WhatsApp via QR code pairing. Full bridge with message formatting, chunking, and webhook-based response collection. Self-hosted bridge service included.
GitHub poll triggers
Runners can now trigger on GitHub events — new PRs, pushes, and other webhook events. Poll-based executor with idempotency keys.
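Polling tends to see the same event more than once, which is where idempotency keys come in. The sketch below shows the general technique; the event fields, key derivation, and in-memory store are assumptions for illustration, not the executor's actual design.

```typescript
import { createHash } from "node:crypto";

// Hypothetical minimal view of a polled GitHub event.
interface PolledEvent {
  repo: string;
  type: string;
  id: string;
}

// Derive a stable key so the same event for the same runner always maps to one key.
function idempotencyKey(runnerId: string, event: PolledEvent): string {
  return createHash("sha256")
    .update([runnerId, event.repo, event.type, event.id].join("\n"))
    .digest("hex");
}

// Return only events not yet seen, recording their keys as a side effect.
function selectNewEvents(
  runnerId: string,
  events: PolledEvent[],
  seenKeys: Set<string>,
): PolledEvent[] {
  const fresh: PolledEvent[] = [];
  for (const event of events) {
    const key = idempotencyKey(runnerId, event);
    if (seenKeys.has(key)) continue; // already triggered a run for this event
    seenKeys.add(key);
    fresh.push(event);
  }
  return fresh;
}
```

A real deployment would persist the keys (for example behind a unique constraint) rather than hold them in memory, so that restarts and concurrent pollers do not trigger duplicate runs.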
MCP Connect
Connect any MCP server to your agents — stdio or streamable HTTP. Full OAuth 2.1 support, credential management, tool namespacing, and a registry browser.
CI/CD pipeline
Auto-deploy Supabase migrations and Tensorlake runtime on push to main. No more manual deploys.
LLM errors surfaced cleanly
Quota hits, auth failures, and model-not-found errors now surface with clear codes instead of generic TENSORLAKE_FUNCTION_FAILED.
Updated model catalog
Latest GPT-5.4 and Claude 4.6 models available. Removed deprecated models and stale pricing.
Knowledge / RAG
Upload documents, embed them with pgvector, and retrieve relevant chunks at query time. Your agents now have context beyond the conversation.
Database integrations
Connect Postgres or Supabase to your agents. Schema introspection, scoped table access, encrypted credentials. Agents query your data without you writing SQL endpoints.
Sign-up and org creation
New sign-up flow with automatic org provisioning. One click and you're in.
Agent dashboard
Create, configure, and test agents from the dashboard. Built-in chat tester so you can iterate without leaving the browser.
We don't remember a time before this.