Reliable AI agents.
Repeatable outcomes.
Build templated runbooks using...
Install the plugin for Claude Code:
claude plugin marketplace add jettyio/jettyio-skills
claude plugin install jetty@jetty
Then run this command to connect to Jetty:
/jetty-setup
Build repeatable tasks with Jetty.
Anatomy of a runbook
A runbook is a spec for your AI agent.
One portable Markdown file: your agent’s job, what “done” looks like, and how it checks its own work before finishing.
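As a rough illustration of the "checks its own work" part, the pass rule and retry limit used by the example runbook in this section (pass at a score of 4.0 or higher with no criterion below 3, at most 3 rounds) could be sketched in Python as follows. This is not Jetty's implementation: the names `passes` and `run_with_bounded_iteration` are hypothetical, and the sketch assumes the overall score is the mean of the per-criterion scores, which the runbook does not state.

```python
from statistics import mean
from typing import Callable, Dict, Tuple

PASS_SCORE = 4.0     # "Pass if score >= 4.0"
MIN_CRITERION = 3    # "no criterion below 3"
MAX_ROUNDS = 3       # "Max 3 rounds"


def passes(scores: Dict[str, int]) -> bool:
    """Apply the rubric rule to one set of per-criterion scores (1 to 5).

    Assumption: the overall score is the mean of the criterion scores.
    """
    values = list(scores.values())
    return mean(values) >= PASS_SCORE and min(values) >= MIN_CRITERION


def run_with_bounded_iteration(
    attempt: Callable[[int], Dict[str, int]],
) -> Tuple[bool, int]:
    """Call `attempt` until the rubric passes or MAX_ROUNDS is reached.

    `attempt` is a hypothetical callback: it takes the round number,
    does the work, and returns the rubric scores for that round.
    Returns (passed, rounds_used).
    """
    for round_no in range(1, MAX_ROUNDS + 1):
        if passes(attempt(round_no)):
            return True, round_no
    return False, MAX_ROUNDS
```

A caller would pass in a function that performs one full attempt at the task and scores it. The loop stops at the first passing round, so a run that passes immediately costs a single attempt.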
---
version: "1.0.0"
evaluation: rubric
agent: claude-code
model: claude-sonnet-4-6
snapshot: python312-uv
---
# ETL Pipeline Agent
## Objective
Fetch new events, enrich each record with
AI-generated summaries, and persist results
to the output table.
## REQUIRED OUTPUT FILES
| {{results_dir}}/validation_report.json |
| {{results_dir}}/summary.md |
| {{results_dir}}/enriched_events.csv |
## Parameters
| source_table | raw.events | Input |
| results_dir | /app/results | Output |
| batch_size | 100 | Records per batch |
## Step 1: Fetch Records
Query source_table for new rows since the
last checkpoint. Process in batches.
## Step 2: Enrich
For each batch, invoke the Summarizer skill
to generate a summary for each record.
## Step 3: Write Results
Persist enriched records to enriched_events.csv.
## Evaluation
| # | Criterion | 5 (Pass) | 1 (Fail) |
|---|-----------|----------|----------|
| 1 | Completeness | All rows enriched | Missing rows |
| 2 | Quality | Summaries coherent | Gibberish |
| 3 | Schema | Valid CSV output | Malformed |
Pass if score ≥ 4.0, no criterion below 3.
## Iteration
If evaluation fails, retry with bounded
iteration. Max 3 rounds.

Featured runbooks.
Start from one that already works. Open a runbook to read what it does, see its example outputs, and make it your own.
Generate presentation decks grounded in real GitHub projects, or walk through a structured brief-to-slides process.
Extract text, tables, and key fields from PDFs into clean, structured JSON.
Turn a spec into a production-ready Cloudflare Durable Object in TypeScript, self-reviewed against the gotchas.
Run the same prompt through two or more models side by side and score the results.
Crawl a site, score each page against on-page SEO checks, and rank the fixes by impact.
QA a described user flow against a live web app — the agent drives a real browser and hands back a report plus a replay script.
Bring a job from your week.
By the end, you’ll have a repeatable task you can run a thousand times.