bounded ETL loop

Agent plans. Crabbox runs. Airbyte moves. Evidence decides.

The harness can be Claude Code, Codex, OpenCode, or another AI coding agent. It writes a bounded job spec. Crabbox leases the worker. Airbyte moves rows. Evidence tells the agent what to do next.

Figure: Agentic Airbyte execution flow. A goal enters the AI harness, the harness calls Crabbox, Crabbox injects a credential profile into a worker, Airbyte moves data from source to target, and evidence returns to the harness.
Nodes: Goal (policy), AI harness (writes spec, chooses next action), Crabbox (lease + run, collect artifacts), Profile (scoped env), Worker (repo + env, Airbyte runs), Source (API / DB), Target (warehouse), Evidence (logs / JUnit, metrics / config).
AI Harness: reads the goal and writes a bounded spec.
calls Crabbox ->
Crabbox: leases a worker and injects the named profile.
runs command ->
Worker + Airbyte: reads the source and writes the target. The agent never sees rows.
returns evidence ->
Evidence: logs, metrics, JUnit, redacted config.
run replay

One job, traced from request to repair.


mental model

Everything is easier when each box owns one question.

Read the system as 6 contracts. Each box gets a narrow input, owns one decision, and emits a narrow output.

Intent -> Spec

Goal becomes refs, profile, retry policy, validation, artifacts.

Spec -> Run

Spec becomes a sandboxed command with a durable run id.

Profile -> Env

Profile name becomes scoped variables inside the worker only.

Source -> Target

Connector moves rows directly. The prompt never becomes the data plane.

Worker -> Evidence

Execution becomes logs, JUnit, metrics, counts, redacted config.

Evidence -> Action

Signals become finish, retry, repair, or alert.
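The last contract, Evidence -> Action, can be sketched as one small pure function. This is a minimal Python sketch under stated assumptions, not a Crabbox or Airbyte API: the `Evidence` fields, the `Action` enum, and the rule that a failed validation after a successful run means repair are hypothetical. Only the retry policy shape (`max_attempts`, `when`) comes from the job spec.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Action(Enum):
    FINISH = "finish"
    RETRY = "retry"
    REPAIR = "repair"
    ALERT = "alert"


@dataclass
class Evidence:
    """Hypothetical summary of one run's artifacts (not a Crabbox schema)."""
    succeeded: bool
    attempt: int                          # 1-based attempt number of this run
    failure_class: Optional[str] = None   # e.g. "rate_limit"; None on success
    failed_validations: List[str] = field(default_factory=list)


def next_action(evidence: Evidence, retry_policy: dict) -> Action:
    """Map evidence to exactly one of finish, retry, repair, or alert."""
    if evidence.succeeded and not evidence.failed_validations:
        return Action.FINISH
    if evidence.succeeded:
        # Assumption: rows moved but a check (e.g. schema_drift) failed,
        # so the spec needs repair rather than a blind rerun.
        return Action.REPAIR
    retryable = evidence.failure_class in retry_policy.get("when", [])
    if retryable and evidence.attempt < retry_policy.get("max_attempts", 1):
        return Action.RETRY
    # Out of attempts, or a failure class the policy does not cover.
    return Action.ALERT
```

Because the function reads only evidence and policy, the agent cannot retry a failure class the spec did not list, and it cannot retry past `max_attempts`.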

runnable shape

The runnable shape has 3 contracts.

A useful agent output is not prose. It is a spec contract, an execution handoff, and an evidence contract.

ai-agent-dispatch.sh
# Goal: sync CRM accounts into the warehouse safely.

crabbox pool ensure example-org/data-movement/main/provider/linux/etl \
  --min-ready 3 \
  --create -- \
  --cache-volume airbyte-etl

mkdir -p .crabbox/generated
cat > .crabbox/generated/accounts-sync.json <<'JSON'
{
  "movement": "source_to_target",
  "source_ref": "source.crm.accounts",
  "target_ref": "warehouse.analytics.accounts",
  "airbyte_connection": "accounts_sync",
  "credential_profile": "etl-warehouse",
  "allow_env": ["AIRBYTE_*", "SOURCE_*", "TARGET_*"],
  "idempotency_key": "accounts_sync:daily",
  "retry": {
    "max_attempts": 2,
    "when": ["rate_limit", "transient_network"]
  },
  "validation": ["row_count", "schema_drift", "freshness"],
  "artifacts": ["reports/**", "metrics.json", "redacted-config.json"],
  "redact": ["password", "token", "secret"]
}
JSON

crabbox run --pool example-org/data-movement/main/provider/linux/etl \
  --shell 'python -m workers.airbyte_sync --config .crabbox/generated/accounts-sync.json' \
  --allow-env 'AIRBYTE_*,SOURCE_*,TARGET_*' \
  --env-from-profile etl-warehouse \
  --artifact-glob 'reports/**,metrics.json,redacted-config.json' \
  --junit reports/

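# RUN_ID is the durable run id of the run above; a literal <run-id> would be
# parsed by the shell as a redirection.
RUN_ID="${RUN_ID:?set RUN_ID to the run id of the crabbox run above}"
crabbox results "$RUN_ID" --json
crabbox artifacts download "$RUN_ID" --out "evidence/$RUN_ID"
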
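The `allow_env` and `redact` fields of the spec can be illustrated from the worker's side. The following is a minimal Python sketch, not the actual `workers.airbyte_sync` module: `filter_env` and `redact` are hypothetical helpers, shell-style glob matching for `allow_env` is an assumption, and so is treating each `redact` entry as a case-insensitive substring of a config key.

```python
import fnmatch


def filter_env(env: dict, allow_env: list) -> dict:
    """Keep only variables whose names match an allowed glob (assumed semantics)."""
    return {
        name: value
        for name, value in env.items()
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in allow_env)
    }


def redact(value, markers: list):
    """Recursively mask values whose key contains any marker (assumed semantics)."""
    if isinstance(value, dict):
        return {
            key: "***" if any(m in key.lower() for m in markers) else redact(val, markers)
            for key, val in value.items()
        }
    if isinstance(value, list):
        return [redact(item, markers) for item in value]
    return value
```

A worker could write `redact(config, spec["redact"])` to `redacted-config.json`, so the evidence describes the connection settings without carrying the secrets themselves.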
failure map

First find the owner. Then read the signal.

Failures are not mysteries. They are boundary breaks. Each class tells you where to look first and what you are allowed to change.

The loop is simple because the boundaries are hard.

Agent plans. Crabbox runs. Airbyte moves. Evidence returns. Repeat only when the evidence says what changed.
