1 · Row count
PASSEvery record made it across — no drops, no dupes.
The one-pager explains the vision. This page is the receipt. We seeded a real ClickHouse source with 50,000 web-analytics events, dispatched an Airbyte-contract worker into a live islo sandbox, moved every row into a DuckDB warehouse, and verified the result four independent ways — down to a byte-exact SHA-256 of the moved data. Crabbox is the dispatcher boundary; islo is the sandbox provider used for this executed run.
Sync succeeded · 50,000 / 50,000 rows · 4 / 4 parity checks passed · exit 0
A standard pipe: an OLAP system of record on the left, an analytics warehouse on the right, and the Airbyte source→destination record contract in the middle. The whole thing executes inside an islo sandbox — the harness never touches the data. In the generalized shape, Crabbox can target other providers behind the same run contract.
ClickHouse
etl.py worker
DuckDB
all of the above runs as one governed job: Crabbox dispatches the run contract, islo provides the sandbox, the repo worker moves the data, and the harness only reads the evidence below.
The worker follows the same contract Airbyte uses — discover, read, write, verify — on the lightweight custom-CDK path so the whole thing runs self-contained in a borrowed box. Here it is, condensed from the real file.
# 1 · DISCOVER — ask the source for its schema, map types for the destination cols = describe(f"{db}.{stream}") # ClickHouse DESCRIBE → catalog ddl = ", ".join(map_type(c) for c in cols) # CH type → DuckDB type (typed, not stringly) # 2 · READ — pull RECORD batches as JSONEachRow (exactly Airbyte's record shape) for offset in range(0, n_rows, BATCH): # 10 batches × 5,000 page = ch(f"SELECT … LIMIT {BATCH} OFFSET {offset} FORMAT JSONEachRow") # 3 · WRITE — vectorized bulk load into the destination warehouse duck.execute(f"INSERT INTO raw.events SELECT * FROM read_json({page})") # 4 · VERIFY — compare src vs dst four ways; exit non-zero unless ALL pass assert src_count == dst_count # row count assert src_revenue == dst_revenue # decimal aggregate assert src_tally == dst_tally # per-type group-by assert sha256(src_rows) == sha256(dst_rows) # ← byte-exact
Condensed for reading — the real, runnable source is worker/etl.py.
This run uses the Airbyte source→destination contract on the custom-CDK path, not a full packaged connector deployment — that's what lets it run self-contained in a sandbox in under a second. What it does prove is the part that matters for agentic data movement: a goal-driven worker can be dispatched into an isolated box, move real typed data end-to-end, and return evidence strong enough (a byte-exact checksum) for a harness to trust the result and decide what to do next.
“It ran” isn't proof. These four checks compare the source and the destination after the move. The worker exits non-zero unless all of them pass. Here is the real output of this run.
Every record made it across — no drops, no dupes.
Decimal aggregate is preserved exactly across the type boundary.
Group-by counts agree for all five event types.
Byte-exact: hash of every (event_id, type, revenue) tuple, sorted. The strongest check.
After the load, the worker queries DuckDB directly — if these return sensible business answers, the moved data is genuinely usable, not just byte-counted.
The worker reads the source catalog and maps each ClickHouse type to a DuckDB type before loading — decimals stay decimals, datetimes stay timestamps.
| column | ClickHouse · source | DuckDB · destination |
|---|---|---|
| event_id | UInt64 | BIGINT |
| user_id | UInt32 | BIGINT |
| session_id | UInt32 | BIGINT |
| event_type | LowCardinality(String) | VARCHAR |
| channel | LowCardinality(String) | VARCHAR |
| device | LowCardinality(String) | VARCHAR |
| country | LowCardinality(String) | VARCHAR |
| url | String | VARCHAR |
| revenue | Decimal(12, 2) | DECIMAL(18,2) |
| ts | DateTime | TIMESTAMP |
This wasn't your laptop. The job ran in a fresh, repo-defined islo microVM in the Canada compute region, then was torn down. Every field below is captured from the live box.
| sandbox id | 019ea238-1f81-7950-80a9-1b80a5e0b556 |
| sandbox name | airbyte-etl |
| image | docker.io/library/python:3.12 |
| kernel | Linux 6.16.9+ · x86_64 |
| vCPU / memory | 4 vCPU · 3930 MB |
| compute region | ca.compute.islo.dev (Canada) |
| public owner | redacted |
| source engine | ClickHouse 26.6.1.472 |
| destination engine | DuckDB 1.5.3 |
| runtime | Python 3.12.13 |
| sync mode | full_refresh_overwrite |
| bytes read | 9,450,617 (~9.0 MB) |
| batches | 10 × 5,000 records |
| read / write split | 0.186s read · 0.145s write |
Nine ::CRABBOX_PHASE:: markers split the job into named steps the orchestrator can time, attach evidence to, and reason over. This is the end-to-end, start to finish.
::CRABBOX_PHASE::bootstrapInstall the ClickHouse static binary and a Python venv (duckdb, requests) into the fresh box.
::CRABBOX_PHASE::boot_clickhouseStart a local ClickHouse server and wait until it answers on the HTTP port.
::CRABBOX_PHASE::seedGenerate 50,000 deterministic web-analytics events (fixed RNG) and insert them — this is the system of record we're asked to move.
::CRABBOX_PHASE::discoverRead the source catalog and map every ClickHouse type to a DuckDB type before a single row moves.
::CRABBOX_PHASE::write_setupCreate the typed destination table in DuckDB from the mapped catalog — drop and recreate, since the sync mode is full_refresh_overwrite.
::CRABBOX_PHASE::syncStream the rows out as JSONEachRow RECORD batches and bulk-load them into the destination warehouse.
::CRABBOX_PHASE::verifyCompare source and destination four ways, including a byte-exact SHA-256. The worker exits non-zero unless all four agree.
::CRABBOX_PHASE::analyticsQuery the destination directly (top countries, funnel) to prove the moved data is usable, not just present.
::CRABBOX_PHASE::emitWrite metrics.json plus a final STATE message — the structured evidence a reflection loop ingests to decide the next move.
::CRABBOX_PHASE::bootstrap [e2e] installing clickhouse static binary ::CRABBOX_PHASE::boot_clickhouse [e2e] ClickHouse up: 26.6.1.472 ::CRABBOX_PHASE::seed [seed] analytics.events ready: rows=50000 total_revenue=611815.02 ::CRABBOX_PHASE::discover {"type":"LOG","log":{"level":"INFO","message":"discovered stream 'events'"}} {"type":"LOG","log":{"level":"INFO","message":"source has 50000 records, total_revenue=611815.02"}} ::CRABBOX_PHASE::write_setup ::CRABBOX_PHASE::sync {"type":"LOG","log":{"level":"INFO","message":"synced 5000/50000 records"}} … 10 batches … {"type":"LOG","log":{"level":"INFO","message":"synced 50000/50000 records"}} ::CRABBOX_PHASE::verify ::CRABBOX_PHASE::analytics ::CRABBOX_PHASE::emit {"type":"STATE","state":{"records_moved":50000,"status":"SUCCEEDED"}} {"type":"LOG","log":{"level":"INFO","message":"sync SUCCEEDED: moved 50000 rows in 0.549s (150700.2 rows/s); checks_passed=True"}} EXIT=0
Full artifacts in the repo: metrics.sandbox.json · e2e_sandbox.log
The seed is deterministic (fixed RNG), so the SHA-256 above is reproducible. Borrow a box, hydrate it from the repo, run the proof, tear it down.
Lease a box, hydrate from the repo, run the proof.
# lease a box, hydrate from the repo, run the proof islo use airbyte-etl \ --config poc/islo.yaml \ --source github://zozo123/agentic-airbyte \ -- bash poc/run_e2e.sh
Dispatch it as a governed run with evidence capture.
# dispatch as a governed run with evidence capture crabbox run --pool org/data-movement/main/... \ --shell 'bash poc/run_e2e.sh' \ --artifact-glob 'poc/reports/**' \ --junit poc/reports/
source — run_e2e.sh · worker/etl.py · worker/seed.py · islo.yaml