Run and evaluate AI agents
without managing infrastructure.

Ship workflows that execute, evaluate, and iterate until they’re right.

What builders are shipping with Jetty

Automated code improvements

Connect your traces and get cost-saving, ready-to-merge PRs.

Code evaluation and benchmarking

Managed eval batches that hill-climb toward a quality bar.

Document processing

Extraction and transformation pipelines with built-in quality gates.

Design consistency

Audit a codebase or assets against a design system.

Anatomy of a runbook

A runbook is a spec for your AI agent.

One portable Markdown file: your agent’s job, what “done” looks like, and how it checks its own work before finishing.

Objective and output manifest

What the agent is doing and the exact files it must produce. The run isn't complete until every file exists.
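A minimal sketch of that completion check, assuming the manifest is simply a list of file names resolved against a results directory. The function names are hypothetical and are not part of Jetty; the file names come from the example runbook on this page.

```python
import tempfile
from pathlib import Path

# File names from the example runbook's REQUIRED OUTPUT FILES section.
MANIFEST = ["validation_report.json", "summary.md", "enriched_events.csv"]


def missing_outputs(results_dir, manifest):
    """Return the manifest entries that do not yet exist as files under results_dir."""
    root = Path(results_dir)
    return [name for name in manifest if not (root / name).is_file()]


def run_is_complete(results_dir, manifest):
    """Hypothetical rule: a run is complete only when every manifest file exists."""
    return not missing_outputs(results_dir, manifest)


# Usage against a throwaway directory that holds only one of the three files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "summary.md").write_text("# Summary\n")
    print(missing_outputs(tmp, MANIFEST))  # ['validation_report.json', 'enriched_events.csv']
    print(run_is_complete(tmp, MANIFEST))  # False
```

Returning the list of missing files, rather than a bare boolean, gives a retry round something concrete to act on.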

Evaluation and iteration

How the agent checks its own work: rubric scoring or programmatic validation. If a check fails, the agent retries within a bounded number of rounds (typically 3).
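Bounded iteration can be pictured as a loop with a hard cap on rounds. The sketch below is illustrative only: `attempt` and `check` are hypothetical callables standing in for the agent's work and its evaluation, and nothing here reflects how Jetty implements the loop internally.

```python
from typing import Callable, Tuple


def run_with_bounded_iteration(
    attempt: Callable[[int], object],
    check: Callable[[object], bool],
    max_rounds: int = 3,
) -> Tuple[bool, int, object]:
    """Call attempt(round) until check(result) passes or max_rounds is used up.

    Returns (passed, rounds_used, last_result).
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    result = None
    for round_number in range(1, max_rounds + 1):
        result = attempt(round_number)
        if check(result):
            return True, round_number, result
    # The cap was reached without a passing result; report the last attempt.
    return False, max_rounds, result


# Toy usage: the "work" doubles the round number and passes once it reaches 4.
print(run_with_bounded_iteration(lambda n: n * 2, lambda r: r >= 4))  # (True, 2, 4)
```

The cap is what keeps a failing task from retrying forever: once the rounds are used up, the caller gets the last result and a clear "did not pass" signal.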

YAML frontmatter

Version, evaluation strategy (programmatic or rubric), agent, model, and snapshot environment.
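To make the frontmatter concrete, here is a small sketch that reads those fields from a runbook using only the standard library. It assumes flat `key: value` pairs like the ones in the example runbook and is not a general YAML parser; `parse_frontmatter` and `validate` are hypothetical helpers, not Jetty APIs.

```python
REQUIRED_KEYS = ("version", "evaluation", "agent", "model", "snapshot")
EVALUATION_STRATEGIES = ("programmatic", "rubric")


def parse_frontmatter(text):
    """Split a runbook into (metadata dict, body). Handles flat `key: value` pairs only."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the runbook body.
            return meta, "\n".join(lines[index + 1:])
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip().strip('"')
    raise ValueError("frontmatter opened with '---' but never closed")


def validate(meta):
    """Return a list of problems; an empty list means the frontmatter looks usable."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in meta]
    if "evaluation" in meta and meta["evaluation"] not in EVALUATION_STRATEGIES:
        problems.append(f"unknown evaluation strategy: {meta['evaluation']}")
    return problems


SAMPLE = """---
version: "1.0.0"
evaluation: rubric
agent: claude-code
model: claude-sonnet-4-6
snapshot: python312-uv
---

# ETL Pipeline Agent
"""

meta, body = parse_frontmatter(SAMPLE)
print(meta["evaluation"], validate(meta))  # rubric []
```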

Parameters and dependencies

Template variables injected at runtime, plus tools and skills the agent needs. The runtime checks availability before execution.
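Template injection can be sketched as a substitution over `{{name}}` placeholders, the syntax used in the example runbook. The `render` helper below is hypothetical and shows only one plausible behaviour, failing loudly when a variable has no value; the actual runtime may differ.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, params):
    """Replace each {{name}} with params[name]; raise KeyError on unknown names."""

    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"no value supplied for template variable '{name}'")
        return str(params[name])

    return PLACEHOLDER.sub(substitute, template)


# Parameter values taken from the example runbook's Parameters section.
params = {"source_table": "raw.events", "results_dir": "/app/results", "batch_size": 100}
print(render("{{results_dir}}/summary.md", params))  # /app/results/summary.md
```

Raising on an unknown name mirrors the idea of checking availability before execution: a typo in a variable name surfaces before any work starts rather than as a literal `{{...}}` in an output path.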

Steps

Sequential plain-language instructions. Each step can run code, call tools, or invoke skills.

RUNBOOK-etl-pipeline.md

---
version: "1.0.0"
evaluation: rubric
agent: claude-code
model: claude-sonnet-4-6
snapshot: python312-uv
---

# ETL Pipeline Agent

## Objective
Fetch new events, enrich each record with
AI-generated summaries, and persist results
to the output table.

## REQUIRED OUTPUT FILES
| {{results_dir}}/validation_report.json |
| {{results_dir}}/summary.md |
| {{results_dir}}/enriched_events.csv |

## Parameters
| source_table | raw.events | Input |
| results_dir  | /app/results | Output |
| batch_size   | 100 | Records per batch |

## Step 1: Fetch Records
Query source_table for new rows since the
last checkpoint. Process in batches.

## Step 2: Enrich
For each batch, invoke the Summarizer skill
to generate a summary for each record.

## Step 3: Write Results
Persist enriched records to enriched_events.csv.

## Evaluation
| # | Criterion | 5 (Pass) | 1 (Fail) |
|---|-----------|----------|----------|
| 1 | Completeness | All rows enriched | Missing rows |
| 2 | Quality | Summaries coherent | Gibberish |
| 3 | Schema | Valid CSV output | Malformed |

Pass if score ≥ 4.0, no criterion below 3.

## Iteration
If evaluation fails, retry with bounded
iteration. Max 3 rounds.
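The pass rule in the example runbook's Evaluation section can be written as a small predicate. This sketch assumes that "score" means the mean of the per-criterion scores, which the runbook does not state explicitly; `rubric_passes` is a hypothetical name.

```python
from statistics import mean


def rubric_passes(scores, min_mean=4.0, min_each=3):
    """Pass when the mean score is at least min_mean and no criterion is below min_each.

    Assumption: the overall score is the arithmetic mean of the criterion scores.
    """
    if not scores:
        raise ValueError("at least one criterion score is required")
    values = list(scores.values())
    return mean(values) >= min_mean and min(values) >= min_each


print(rubric_passes({"Completeness": 5, "Quality": 4, "Schema": 4}))  # True
print(rubric_passes({"Completeness": 5, "Quality": 5, "Schema": 2}))  # False
```

The second call shows why the floor matters: the mean is exactly 4.0, yet the run fails because one criterion scored below 3.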

Build runbooks using Claude Code

Run these in your terminal to install the Jetty skill:

$ claude plugin marketplace add jettyio/jettyio-skills

$ claude plugin install jetty@jetty

Then run this command inside Claude Code:

/jetty-setup

Build repeatable tasks with Jetty.

Jetty authoring agent

Helps you describe and shape your task

✓ Task created · ready to run

Spec it

Run it

Eval it

Get it
