Free & Open Source · Zero Config · 93+ Real Measurements

Your LLM code is
secretly wasting energy

One line of code can increase energy use by up to 147%. Install our GitHub Bot and it audits every PR automatically, catching issues that even experienced engineers miss.

Install on GitHub · See how it works

147%

Energy Wasted

with the default load_in_8bit=True config

95.7%

GPU Power Wasted

when using batch_size=1

93+

Real Measurements

across 3 NVIDIA GPU architectures

The problem no one talks about

Common quantization advice is wrong. We measured it.

✗ What most people write

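from transformers import BitsAndBytesConfig

# "INT8 saves energy, right?"
config = BitsAndBytesConfig(
    load_in_8bit=True,
)
# "Just process one at a time"
# (assumes model, tokenizer, and prompts are already defined)
for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    model.generate(**inputs)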

+147% energy · -76% throughput · +95.7% waste

✓ What the Bot recommends

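from transformers import BitsAndBytesConfig

# Pure INT8: no mixed-precision overhead
config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_threshold=0.0,  # key fix
)
# Batched inference
# (assumes model, tokenizer, and prompts are already defined)
batch = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
outputs = model.generate(**batch)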

Saves energy · +84% throughput · Data-backed
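
Batched inference assumes the prompts are grouped before generate is called. A minimal sketch of that grouping, using only the standard library. The helper name `batched` and the batch size of 4 are illustrative assumptions, not values the Bot prescribes:

```python
from typing import Iterator, List, Sequence


def batched(prompts: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `prompts`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(prompts), batch_size):
        yield list(prompts[start:start + batch_size])


# Ten placeholder prompts grouped into batches of four (hypothetical size).
prompts = [f"prompt {i}" for i in range(10)]
batches = list(batched(prompts, batch_size=4))
print([len(b) for b in batches])  # [4, 4, 2]
```

Each batch would then be tokenized with padding and passed to the model in a single generate call. The right batch size depends on GPU memory and sequence length.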

3 steps. 60 seconds.

Zero configuration. No code changes needed.

Install the Bot

Click "Install" on the GitHub App page. Choose which repos to audit.

Open a Pull Request

Push code with LLM quantization configs (BitsAndBytesConfig, etc.) and create a PR as usual.

Get Instant Audit

The Bot scans the diff, flags energy waste patterns, and posts a comment with data-backed fixes.
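
To make the scanning step concrete, here is a rough sketch of how one such pattern could be detected with Python's standard `ast` module. This is an illustration under assumptions, not the Bot's actual implementation. The function name `find_default_int8` is hypothetical, and only literal keyword arguments are inspected:

```python
import ast
from typing import List


def find_default_int8(source: str) -> List[int]:
    """Return line numbers of BitsAndBytesConfig(...) calls that set
    load_in_8bit=True without also setting llm_int8_threshold=0.0."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Accept both BitsAndBytesConfig(...) and transformers.BitsAndBytesConfig(...).
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "BitsAndBytesConfig":
            continue
        # Only literal keyword values are inspected; dynamic values are ignored.
        kwargs = {
            kw.arg: kw.value.value
            for kw in node.keywords
            if kw.arg is not None and isinstance(kw.value, ast.Constant)
        }
        if kwargs.get("load_in_8bit") is True and kwargs.get("llm_int8_threshold") != 0.0:
            flagged.append(node.lineno)
    return sorted(flagged)


sample = """\
config_a = BitsAndBytesConfig(load_in_8bit=True)
config_b = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_threshold=0.0,
)
"""
print(find_default_int8(sample))  # [1]
```

Only the first call is flagged. The second already carries the threshold override, so a check like this would stay quiet about it.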

What the Bot looks like in action

Automatic PR comment with prioritized issues and fixes

ecocompute-energy-auditor bot

just now

⚡ EcoCompute Energy Audit

Scanned 1 Python file(s) in this PR. 1 critical issue(s) found.

🔴 Critical: Default INT8 (bitsandbytes mixed-precision)

load_in_8bit=True without llm_int8_threshold=0.0 causes 17–147% energy waste due to INT8↔FP16 type conversion at every linear layer.

Fix: Add llm_int8_threshold=0.0

🟡 Warning: Sequential single-request processing (BS=1)

Processing prompts in a loop wastes up to 95.7% energy. Use batched inference or vLLM continuous batching.

📊 Based on 93+ measurements across RTX 4090D / A800 / RTX 5090 · View full data

Need CI/CD integration?

Use our GitHub Action for deeper pipeline control — hardware detection, baseline calibration, and energy regression gating.

GitHub Bot

One-click install, automatic PR audit

✓ Zero config
✓ 6 detection rules
✓ Auto PR comments
✓ Free forever

Install Bot — Free

GitHub Action

CI/CD pipeline, team-level control

✓ All Bot features
✓ GPU hardware detection
✓ Baseline calibration
✓ CI gate (pass/fail)

View on GitHub
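
As an illustration of what an energy regression gate could look like, the sketch below compares a measured value against a calibrated baseline. The function name, the joule units, and the 5% tolerance are all assumptions made for this example, not the Action's documented interface:

```python
def energy_gate(baseline_joules: float, measured_joules: float,
                tolerance: float = 0.05) -> bool:
    """Pass when measured energy exceeds the baseline by at most `tolerance`."""
    if baseline_joules <= 0:
        raise ValueError("baseline_joules must be positive")
    regression = (measured_joules - baseline_joules) / baseline_joules
    return regression <= tolerance


print(energy_gate(100.0, 103.0))  # True: 3% over baseline, within tolerance
print(energy_gate(100.0, 247.0))  # False: 147% over baseline
```

A CI job could turn a False result into a non-zero exit code to fail the pipeline.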

What it detects

6 energy waste patterns, backed by real GPU measurements

Default INT8 config

17–147% energy waste

Mixed precision conflicts

Unpredictable waste

⚠

NF4 on small models (<3B)

~29% energy penalty

⚠

Batch size = 1 loops

Up to 95.7% waste

Missing device_map

Potential waste

Redundant quant params

Code quality
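
One of these rules, NF4 on small models, reduces to a threshold check. A minimal sketch, assuming the parameter count and quantization type are already known. The function and constant names are hypothetical, and only the 3B cutoff comes from the list above:

```python
SMALL_MODEL_PARAMS = 3_000_000_000  # the "<3B" cutoff from the list above


def nf4_small_model_warning(num_params: int, quant_type: str) -> bool:
    """Return True when NF4 quantization is applied to a model under 3B parameters."""
    return quant_type.lower() == "nf4" and num_params < SMALL_MODEL_PARAMS


print(nf4_small_model_warning(1_500_000_000, "nf4"))  # True
print(nf4_small_model_warning(7_000_000_000, "nf4"))  # False
```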

Frequently Asked Questions

Is it free?

100% free. Open source project, open data, open bot. No premium tier, no catch.

Is my code safe?

The Bot only reads PR diffs (code changes), never the full repo. No code is stored. Everything is discarded after the audit. The Bot is fully open-source — audit our code.
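
To show what "reading only the diff" can mean in practice, the sketch below pulls just the added lines of Python files out of a unified diff. It is an illustration with a hypothetical function name, not the Bot's code, and it relies on the standard `---`/`+++` file headers:

```python
from typing import Dict, List, Optional


def added_python_lines(diff_text: str) -> Dict[str, List[str]]:
    """Map each changed .py file to the lines a unified diff adds to it."""
    added: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # New-file header, e.g. "+++ b/train.py"
            path = line[4:].strip()
            if path.startswith("b/"):
                path = path[2:]
            current = path if path.endswith(".py") else None
            if current is not None:
                added.setdefault(current, [])
        elif line.startswith("--- "):
            continue  # old-file header
        elif line.startswith("+") and current is not None:
            added[current].append(line[1:])
    return added


diff = """\
--- a/train.py
+++ b/train.py
@@ -1,2 +1,3 @@
 import torch
+config = BitsAndBytesConfig(load_in_8bit=True)
 model = load()
--- a/README.md
+++ b/README.md
@@ -1 +1,2 @@
 # Demo
+More docs
"""
print(added_python_lines(diff))
# {'train.py': ['config = BitsAndBytesConfig(load_in_8bit=True)']}
```

Unchanged lines, removed lines, and non-Python files never enter the result, so nothing outside the change set is examined.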

Will it modify my code?

No. The Bot only posts comments. It will never push commits or change your code. You decide which suggestions to adopt.

Will it spam my PRs?

No. The Bot stays completely silent when (1) the PR has no Python files, (2) the Python files contain no quantization code, or (3) your config is already correct. It only speaks when it finds a real issue.
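
The three silent conditions amount to a single boolean check. A sketch with hypothetical names and inputs, since the Bot's real decision logic is not shown here:

```python
from typing import Sequence


def should_comment(changed_files: Sequence[str],
                   has_quant_code: bool,
                   findings: Sequence[str]) -> bool:
    """Return True only when none of the three silent conditions applies."""
    has_python = any(path.endswith(".py") for path in changed_files)
    return has_python and has_quant_code and len(findings) > 0


print(should_comment(["README.md"], False, []))                     # False
print(should_comment(["train.py"], True, []))                       # False
print(should_comment(["train.py"], True, ["default INT8 config"]))  # True
```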

Can I install it on an Organization?

Yes. Choose your Organization account during installation. Requires org admin permissions.

How do I uninstall?

Go to github.com/settings/installations, find ecocompute-energy-auditor, click Configure → Uninstall.

Where does the data come from?

93+ empirical measurements on RTX 4090D, A800, and RTX 5090. NVML 10Hz power sampling, n=10 per config, CV<2%. Explore the full dataset.
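
For readers who want to see how figures like these fit together, the sketch below turns 10 Hz power samples into an energy estimate and computes the coefficient of variation across repeated runs. The rectangle-sum integration and the sample values are assumptions made for illustration. The actual measurement pipeline may differ, and NVML itself is not called here:

```python
import statistics
from typing import Sequence

SAMPLE_HZ = 10.0  # sampling rate quoted above


def energy_joules(power_watts: Sequence[float], sample_hz: float = SAMPLE_HZ) -> float:
    """Approximate energy as the sum of power samples times the sample period."""
    return sum(power_watts) / sample_hz


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Ten made-up runs (n=10), each with one second of made-up readings at 10 Hz.
runs = [[300.0 + run] * 10 for run in range(10)]
energies = [energy_joules(run) for run in runs]
cv = coefficient_of_variation(energies)
print(energies[0])  # 300.0
print(cv < 0.02)    # True
```

A configuration whose runs fall under the 2% threshold would count as stable under the CV<2% criterion.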

Stop guessing.
Start measuring.

Install the Bot in 60 seconds. Next time you open a PR, you'll know exactly where your energy goes.

Install on GitHub — It's Free

Works with any public or private repo · Supports GitHub Organizations
