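Model comparison leaderboard
Aggregates each model's single-shot bare run at provider defaults across 7 tasks. The table can be sorted by success rate, average cost, or average latency; the thumbnail strip on the right shows the render result for each task. Click a thumbnail to enlarge it (←→ / ↑↓ move to the neighboring model or task). For details, including effort/thinking variants and iteration chains, follow the link on each model name.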
status: success · render_error / no_code · submit_missing / timeout
tasks: 1 cube-with-hole · 2 simple-mug · 3 stepped-pyramid · 4 hex-bolt · 5 l-bracket · 6 offset-handle-mug · 7 butt-hinge (frame color = tier2 / tier3)
| # | model | success rate | trials | avg latency (shorter = faster) | avg cost (shorter = cheaper) | renders / task (× = failed, · = not run) |
|---|---|---|---|---|---|---|
| 1 | gemini 3.1 flash-lite (36 runs) | 100% (8/8) | 7 | | | |
| 2 | gpt 5.4 mini (42 runs) | 100% (7/7) | 7 | | | |
| 3 | claude haiku 4.5 (30 runs) | 100% (7/7) | 7 | | | |
| 4 | gemini 3.1 pro (31 runs) | 100% (8/8) | 7 | | | |
| 5 | gemini 3.5 flash (37 runs) | 100% (7/7) | 7 | | | |
| 6 | gpt 5.3 codex (28 runs) | 100% (7/7) | 7 | | | |
| 7 | gpt 5.4 (44 runs) | 100% (7/7) | 7 | | | |
| 8 | claude sonnet 4 (19 runs) | 100% (7/7) | 7 | | | |
| 9 | claude sonnet 4.5 (19 runs) | 100% (7/7) | 7 | | | |
| 10 | claude opus 4.5 (35 runs) | 100% (7/7) | 7 | | | |
| 11 | claude sonnet 4.6 (44 runs) | 100% (7/7) | 7 | | | |
| 12 | claude opus 4.8 (58 runs) | 100% (7/7) | 7 | | | |
| 13 | claude opus 4.7 (58 runs) | 100% (7/7) | 7 | | | |
| 14 | gpt 5.2 codex (28 runs) | 100% (7/7) | 7 | | | |
| 15 | o3 (42 runs) | 100% (7/7) | 7 | | | |
| 16 | claude opus 4.1 (28 runs) | 100% (7/7) | 7 | | | |
| 17 | gpt 5.5 (44 runs) | 100% (7/7) | 7 | | | |
| 18 | claude opus 4 (22 runs) | 100% (7/7) | 7 | | | |
| 19 | gpt 5.1 codex max (28 runs) | 100% (7/7) | 7 | | | |
| 20 | gpt 5 (42 runs) | 100% (7/7) | 7 | | | |
| 21 | gpt 5.1 codex (28 runs) | 100% (7/7) | 7 | | | |
| 22 | gpt 5.1 codex mini (28 runs) | 100% (7/7) | 7 | | | |
| 23 | gpt 5 codex (27 runs) | 100% (7/7) | 7 | | | |
| 24 | claude fable 5 (51 runs) | 100% (7/7) | 7 | | | |
| 25 | qwen3-8b (1 run · 1/7 tasks) | 100% (1/1) | 1 | | — | · · · · · · |
| 26 | gemini 2.5 flash-lite (34 runs) | 88% (7/8) | 7 | | | |
| 27 | gemini 3 flash (38 runs) | 88% (7/8) | 7 | | | |
| 28 | gemini 2.5 flash (36 runs) | 88% (7/8) | 7 | | | |
| 29 | gemini 2.5 pro (29 runs) | 88% (7/8) | 7 | | | |
| 30 | gpt 5.4 nano (42 runs) | 86% (6/7) | 7 | | | × |
| 31 | gpt 4.1 (28 runs) | 86% (6/7) | 7 | | | × |
| 32 | gpt 5 nano (42 runs) | 86% (6/7) | 7 | | | × |
| 33 | gpt 5 mini (42 runs) | 86% (6/7) | 7 | | | × |
| 34 | o4 mini (42 runs) | 86% (6/7) | 7 | | | × |
| 35 | google/gemma-4-e2b (9 runs) | 86% (6/7) | 7 | | — | × |
| 36 | openai/gpt-oss-20b (7 runs) | 86% (6/7) | 7 | | — | × |
| 37 | google/gemma-3-27b (8 runs · 6/7 tasks) | 83% (5/6) | 6 | | — | × · |
| 38 | google/gemma-4-e4b (9 runs) | 71% (5/7) | 7 | | — | × × |
| 39 | nvidia/nemotron-3-nano-4b (7 runs) | 29% (2/7) | 7 | | — | × × × × × |
| 40 | qwen3-0.6b (7 runs) | 0% (0/7) | 7 | | — | × × × × × × × |
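
The success-rate column in the table can be reproduced from the raw counts. The following is a minimal sketch, not the leaderboard's actual code: the names `Summary`, `summarize`, and `rank` are hypothetical, treating only `success` as a pass is an assumption based on the status legend, and round-half-up is an assumed rounding rule that happens to match the percentages shown (7/8 → 88%, 6/7 → 86%, 5/6 → 83%).

```python
from dataclasses import dataclass

# Status labels taken from the legend. Treating everything except
# "success" as a failure is an assumption, not confirmed by the source.
FAILURE_STATUSES = {"render_error", "no_code", "submit_missing", "timeout"}


@dataclass(frozen=True)
class Summary:
    """Hypothetical per-model aggregate: successes out of counted runs."""
    model: str
    successes: int
    runs: int

    @property
    def rate(self) -> float:
        return self.successes / self.runs if self.runs else 0.0

    @property
    def percent(self) -> int:
        # Integer round-half-up, avoiding float rounding surprises.
        if self.runs == 0:
            return 0
        return (200 * self.successes + self.runs) // (2 * self.runs)


def summarize(model: str, statuses: list[str]) -> Summary:
    """Count successes in a list of run statuses for one model."""
    for status in statuses:
        if status != "success" and status not in FAILURE_STATUSES:
            raise ValueError(f"unknown status: {status}")
    successes = sum(1 for status in statuses if status == "success")
    return Summary(model, successes, len(statuses))


def rank(rows: list[Summary]) -> list[Summary]:
    """Sort by success rate, highest first; ties keep their input order."""
    return sorted(rows, key=lambda row: row.rate, reverse=True)


# Counts copied from the table above.
rows = [
    Summary("qwen3-0.6b", 0, 7),
    Summary("gemini 2.5 pro", 7, 8),
    Summary("google/gemma-3-27b", 5, 6),
    Summary("gpt 4.1", 6, 7),
]
for row in rank(rows):
    print(f"{row.model}: {row.percent}% ({row.successes}/{row.runs})")
```

Because Python's sort is stable, models with equal success rates stay in their input order; how the page itself breaks ties is not stated in the source.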