Published Runs: 7
Active Task: Open-ended progress
Scoring: Agency score
Assets: Videos and JSON
All Published Runs
| Run | Harness | Model | Problem | Agency | Raw | Survival | Level | Last Milestone | Elapsed | Frames | Clips | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gemini 3 Flash via api - Open-ended progress 20260507_164917_1_api_google_gemini-3-flash-preview_03_open_ended_progress | api | Gemini 3 Flash | Open-ended progress | 4 | 7 | 0 | L2 | wood 100 | 28802s | 4814 | 1 | stalled |
| Claude Opus 4.7 via api - Open-ended progress 20260507_164917_1_api_anthropic_claude-opus-4.7_03_open_ended_progress | api | Claude Opus 4.7 | Open-ended progress | 0 | 4 | 0 | L0 | inventory opened | 28802s | 3326 | 1 | invalid |
| Gemini 3.1 Pro via api - Open-ended progress 20260507_164917_1_api_google_gemini-3.1-pro-preview_03_open_ended_progress | api | Gemini 3.1 Pro | Open-ended progress | 3 | 5 | 0 | L2 | wood 25 | 28802s | 3734 | 1 | stalled |
| GPT-5.5 via api - Open-ended progress 20260507_164917_1_api_openai_gpt-5.5_03_open_ended_progress | api | GPT-5.5 | Open-ended progress | 0 | 3 | 0 | L0 | held pickaxe | 28802s | 3080 | 1 | invalid |
| Kimi K2.6 via api - Open-ended progress 20260507_164917_1_api_moonshotai_kimi-k2.6_03_open_ended_progress | api | Kimi K2.6 | Open-ended progress | 0 | 4 | 0 | L0 | inventory opened | 28802s | 1622 | 1 | invalid |
| Qwen 3.6 27B via api - Open-ended progress 20260507_164917_1_api_qwen_qwen3.6-27b_03_open_ended_progress | api | Qwen 3.6 27B | Open-ended progress | 0 | 4 | 0 | L0 | inventory opened | 21935s | 6152 | 1 | invalid |
| Gemini 3.1 Flash Lite via api - Open-ended progress 20260507_164917_1_api_google_gemini-3.1-flash-lite-preview_03_open_ended_progress | api | Gemini 3.1 Flash Lite | Open-ended progress | 3 | 6 | 0 | L2 | wood 25 | 21930s | 5838 | 1 | complete |