Updates
Supplemental measurements and analyses extending the LLM energy benchmark paper: new GPUs, precisions, and protocols.
2026-06-08 · End-to-End Baseline
RTX PRO 6000 Blackwell FP16 End-to-End Baseline
FP16 end-to-end baseline for Qwen2.5-3B at 256 and 512 generated tokens, reported as energy per 1,000 tokens (J/1k tokens) alongside throughput and power.
2026-06-03 · Supplementary Case Study
RTX PRO 6000 Blackwell: Phase-Separated Energy Profiling
Separate energy profiling of the prefill and decode phases, highlighting an interaction between the backend and the GPU architecture under bitsandbytes 0.49.2.
2026-04-18 · Supplemental Update
Qwen2.5-3B on Tesla T4
FP16 vs NF4 comparison for Qwen2.5-3B on Tesla T4, including a summary of Figure 5 and the results from Table 8.
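For readers unfamiliar with the J/1k tokens unit used in these updates, the sketch below shows one way such a figure can be derived from mean power and throughput, or from total energy and token count. The function names and the input numbers are hypothetical and purely illustrative. The benchmark's actual protocol (for example, how idle power or the prefill phase is handled) is not described on this page and may differ.

```python
def joules_per_1k_tokens(mean_power_w: float, tokens_per_s: float) -> float:
    """Energy per 1,000 tokens from mean power (W) and throughput (tokens/s).

    Watts are joules per second, so W / (tokens/s) gives joules per token.
    """
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return 1000.0 * mean_power_w / tokens_per_s


def joules_per_1k_from_run(energy_j: float, generated_tokens: int) -> float:
    """Same metric, computed from a run's total energy (J) and token count."""
    if generated_tokens <= 0:
        raise ValueError("generated_tokens must be positive")
    return 1000.0 * energy_j / generated_tokens


# Illustrative inputs only; these are NOT measurements from the benchmark.
print(joules_per_1k_tokens(mean_power_w=200.0, tokens_per_s=50.0))   # 4000.0
print(joules_per_1k_from_run(energy_j=1024.0, generated_tokens=256))  # 4000.0
```

Both forms agree whenever mean power is total energy divided by wall-clock time over the same measurement window, which is why throughput and power are typically reported next to the energy figure.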