env-rosetta same Wordle env · 4 RL frameworks · 4 islo.dev sandboxes — side by side

The setup. @adithya_s_k reimplemented the same Wordle env across six RL frameworks as a Rosetta stone. We took four of those (OpenEnv, ORS, NeMo Gym, Verifiers), provisioned each in its own islo.dev sandbox in parallel, and laid them out side by side. Same env, four dialects.

why per-sandbox

HF Spaces is fine for static demos. For hosting envs at RL training time, cold per-trial sandboxes are the right shape: K parallel rollouts that each mutate state independently, per-trial isolation, and the ability to SSH in and debug a hung env.
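
A minimal sketch of the "K parallel rollouts, isolated state" shape, in plain Python. ToyWordleEnv, rollout, and run_parallel are hypothetical stand-ins, not any framework's API; in the real setup each env instance would live in its own sandbox rather than in a thread.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class ToyWordleEnv:
    """Hypothetical stand-in for one sandboxed env; not a real framework API."""
    secret: str
    guesses: list[str] = field(default_factory=list)

    def step(self, guess: str) -> bool:
        # Mutates only this instance's state, as a per-trial sandbox would.
        self.guesses.append(guess)
        return guess == self.secret


def rollout(trial_id: int, secret: str, policy: list[str]) -> dict:
    # One fresh env per trial: nothing is shared between rollouts.
    env = ToyWordleEnv(secret=secret)
    solved = False
    for guess in policy:
        solved = env.step(guess)
        if solved:
            break
    return {"trial": trial_id, "solved": solved, "turns": len(env.guesses)}


def run_parallel(secrets: list[str], policy: list[str]) -> list[dict]:
    # K = len(secrets) rollouts run concurrently; map preserves input order.
    jobs = [(i, s, policy) for i, s in enumerate(secrets)]
    with ThreadPoolExecutor(max_workers=len(secrets)) as pool:
        return list(pool.map(lambda args: rollout(*args), jobs))
```

Each trial gets its own guess history, so a rollout that hangs or corrupts its state cannot affect its neighbours. That is the property per-trial sandboxes give you across processes and machines.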

provision in one command

islo use rosetta-openenv \
  --source github://adithya-s-k/RL_Envs_101 \
  -- bash -c 'cd envs/wordle_env/openenv && uv venv .venv && . .venv/bin/activate \
              && uv pip install -e . \
              && setsid -f uvicorn server.app:app --host 0.0.0.0 --port 8080'

not yet

Adithya's Jupyter agent env (real Python code-exec) hard-depends on e2b-code-interpreter. Swapping E2B → islo isn't mechanical: it means writing an IsloSandbox class that matches E2BSandbox.run_code and plumbing it through envs/jupyter_env/<framework>/e2b_sandbox.py in all 4 frameworks. That's the real "islo replaces E2B" story, and it's Tier 2, sketched in POST.md.
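
To make the shape of that work concrete, here is a rough sketch of the adapter idea. Everything in it is an assumption: the run_code(code) -> str signature is a guess at what the envs call (the real E2BSandbox.run_code may take and return something else), and exec_in_sandbox is a hypothetical injected transport, since no islo client API is assumed here.

```python
from __future__ import annotations

from typing import Callable, Protocol


class CodeSandbox(Protocol):
    """Assumed interface the envs depend on; the real
    E2BSandbox.run_code signature may differ."""

    def run_code(self, code: str) -> str: ...


class IsloSandbox:
    """Hypothetical adapter. `exec_in_sandbox` stands in for whatever
    transport actually reaches the sandbox; it is injected rather than
    hard-coded because no islo client API is assumed."""

    def __init__(self, name: str, exec_in_sandbox: Callable[[str, str], str]):
        self.name = name
        self._exec = exec_in_sandbox

    def run_code(self, code: str) -> str:
        # Delegate to the injected transport, keyed by sandbox name.
        return self._exec(self.name, code)


def run_cell(sandbox: CodeSandbox, code: str) -> str:
    # Env-side call site: depends only on run_code, not on the backend.
    return sandbox.run_code(code)
```

If the env code only ever touches run_code, the per-framework change reduces to constructing a different sandbox object. The non-mechanical part is matching the real return type and error behaviour, which this sketch does not attempt.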

credits