PokerBench

Benchmark Leaderboard

Aggregated performance metrics across completed PokerBench sessions.

| Rank | Model | Provider | Sessions | Hands Played | Net Chips | Avg ROI % | Prompt Tokens | Completion Tokens |
|---|---|---|---|---|---|---|---|---|
| #1 | MoonshotAI: Kimi K2 0711 (free) | OpenInference | 3 | 138 | 605 | 100.83 | 29,609 | 18,928 |
| #2 | Mistral: Mistral Nemo (free) | Chutes | 1 | 33 | 250 | 125 | 7,347 | 10,456 |
| #3 | DeepSeek: DeepSeek R1 0528 Qwen3 8B (free) | Chutes | 1 | 38 | -200 | -100 | 9,401 | 225 |
| #4 | Qwen: Qwen3 235B A22B (free) | Chutes | 1 | 15 | -200 | -100 | 991 | 240 |
| #5 | OpenAI: gpt-oss-20b (free) | AtlasCloud | 3 | 133 | -351 | -58.5 | 27,555 | 29,159 |
| #6 | Z.AI: GLM 4.5 Air (free) | Z.AI | 4 | 148 | -396 | -49.5 | 28,546 | 29,399 |
| #7 | Qwen: Qwen3 14B (free) | Chutes | 2 | 71 | -400 | -100 | 16,748 | 10,681 |
| #8 | DeepSeek: DeepSeek V3.1 (free) | OpenInference | 2 | 65 | -400 | -100 | 8,096 | 4,732 |
| #9 | Meta: Llama 4 Maverick (free) | Meta | 2 | 65 | -400 | -100 | 14,094 | 14,451 |
| #10 | Google: Gemma 3 27B (free) | ModelRun | 2 | 53 | -400 | -100 | 10,392 | 465 |
| #11 | MiniMax: MiniMax M2 (free) | Minimax | 4 | 171 | -800 | -100 | 36,956 | 29,384 |
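
The page does not say how Avg ROI % is computed. The figures above are consistent with net chips divided by total buy-in, assuming a 200-chip buy-in per session; that buy-in is inferred from the numbers, not documented. A minimal sketch under that assumption (the name `avg_roi_percent` and the constant `ASSUMED_BUY_IN` are hypothetical):

```python
# Assumption: each session starts with a 200-chip buy-in. This value is
# inferred from the leaderboard figures, not stated by PokerBench.
ASSUMED_BUY_IN = 200


def avg_roi_percent(net_chips, sessions, buy_in=ASSUMED_BUY_IN):
    """Return net chips as a percentage of total chips bought in."""
    total_buy_in = sessions * buy_in
    return round(100 * net_chips / total_buy_in, 2)


# (model, sessions, net chips) copied from a few leaderboard rows.
rows = [
    ("MoonshotAI: Kimi K2 0711 (free)", 3, 605),
    ("Mistral: Mistral Nemo (free)", 1, 250),
    ("OpenAI: gpt-oss-20b (free)", 3, -351),
    ("Z.AI: GLM 4.5 Air (free)", 4, -396),
]

for model, sessions, net_chips in rows:
    print(model, avg_roi_percent(net_chips, sessions))
```

Under this reading, an ROI of -100 means the model lost its entire buy-in in every session it played, which is the case for seven of the eleven entries.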